[Posted]: 2016-03-21 20:20:26
[Problem description]:
Suppose I have the following table in MySQL:
CREATE TABLE `events` (
`pv_name` varchar(60) COLLATE utf8mb4_unicode_ci NOT NULL,
`time_stamp` bigint(20) unsigned NOT NULL,
`event_type` varchar(40) COLLATE utf8mb4_unicode_ci NOT NULL,
`value` text CHARACTER SET utf8mb4 COLLATE utf8mb4_bin,
`value_type` varchar(40) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`value_count` bigint(20) DEFAULT NULL,
`alarm_status` varchar(40) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`alarm_severity` varchar(40) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`pv_name`,`time_stamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED;
Is there any way to improve the following query, through indexes or otherwise?
SELECT DISTINCT events.pv_name
FROM events
WHERE events.time_stamp > t0_in AND events.time_stamp < t1_in
AND (events.value IS NULL OR events.alarm_severity = 'INVALID');
t0_in and t1_in are parameters passed to the stored procedure that defines the query.
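For context, such a procedure might look roughly like the sketch below. The procedure name `find_invalid_pvs` is hypothetical, and the parameter types are assumed to match the `time_stamp` column.

```sql
-- DELIMITER is a mysql command-line client directive, not server-side SQL.
DELIMITER //

-- Hypothetical wrapper; the name find_invalid_pvs is an assumption.
CREATE PROCEDURE find_invalid_pvs(
    IN t0_in BIGINT UNSIGNED,  -- lower bound (exclusive), assumed to match time_stamp
    IN t1_in BIGINT UNSIGNED   -- upper bound (exclusive)
)
BEGIN
    SELECT DISTINCT events.pv_name
    FROM events
    WHERE events.time_stamp > t0_in AND events.time_stamp < t1_in
      AND (events.value IS NULL OR events.alarm_severity = 'INVALID');
END //

DELIMITER ;
```

It would then be called as, for example, `CALL find_invalid_pvs(0, 1426224880000000000);`.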
Running the query with EXPLAIN gives:
+----+-------------+--------+-------+---------------+---------+---------+------+----------+-------------+
| id | select_type | table  | type  | possible_keys | key     | key_len | ref  | rows     | Extra       |
+----+-------------+--------+-------+---------------+---------+---------+------+----------+-------------+
|  1 | SIMPLE      | events | index | PRIMARY       | PRIMARY | 250     | NULL | 12724016 | Using where |
+----+-------------+--------+-------+---------------+---------+---------+------+----------+-------------+
Running the query on the database returns 102620 rows in 1 min 50.93 sec.
Update
For simplicity, assume the table is as follows:
CREATE TABLE `events` (
`pv_name` varchar(60) COLLATE utf8mb4_unicode_ci NOT NULL,
`time_stamp` bigint(20) unsigned NOT NULL,
`value_valid` tinyint(1) NOT NULL,
PRIMARY KEY (`pv_name`,`time_stamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED;
Is it possible to add a suitable index so that the following query, or an equivalent one, uses the loose index scan optimization?
SELECT DISTINCT events.pv_name
FROM events
WHERE events.time_stamp > t0_in AND events.time_stamp < t1_in
AND events.value_valid = 0;
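One way to experiment with this is sketched below; it is a starting point under stated assumptions, not a confirmed solution. The index name `valid_time` is hypothetical. Because InnoDB secondary indexes implicitly store the primary key columns, an index on (`value_valid`, `time_stamp`) also contains `pv_name` and therefore covers the query. A loose index scan, if the optimizer chooses one, appears as `Using index for group-by` in the Extra column of EXPLAIN.

```sql
-- Candidate index to experiment with; the name valid_time is hypothetical.
ALTER TABLE events ADD INDEX valid_time (value_valid, time_stamp);

-- GROUP BY form of the DISTINCT query, with literal bounds for testing.
-- Check the Extra column for "Using index" (covering index) or
-- "Using index for group-by" (loose index scan).
EXPLAIN
SELECT events.pv_name
FROM events
WHERE events.time_stamp > 0 AND events.time_stamp < 1426224880000000000
  AND events.value_valid = 0
GROUP BY events.pv_name;
```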
Update
If I add an index on time_stamp, I get:
mysql> EXPLAIN SELECT DISTINCT events.pv_name FROM events WHERE events.time_stamp > 0 AND events.time_stamp < 11426224880000000000 AND (events.value IS NULL OR events.alarm_severity = 'INVALID');
+----+-------------+--------+-------+--------------------+---------+---------+------+----------+-------------+
| id | select_type | table  | type  | possible_keys      | key     | key_len | ref  | rows     | Extra       |
+----+-------------+--------+-------+--------------------+---------+---------+------+----------+-------------+
|  1 | SIMPLE      | events | index | PRIMARY,time_stamp | PRIMARY | 250     | NULL | 13261211 | Using where |
+----+-------------+--------+-------+--------------------+---------+---------+------+----------+-------------+
Running this query on the database returns 11511 rows in 30.44 sec.
mysql> EXPLAIN SELECT DISTINCT events.pv_name FROM events FORCE INDEX (time_stamp) WHERE events.time_stamp > 0 AND events.time_stamp < 11426224880000000000 AND (events.value IS NULL OR events.alarm_severity = 'INVALID');
+----+-------------+--------+-------+--------------------+------------+---------+------+---------+-----------------------------------------------------+
| id | select_type | table  | type  | possible_keys      | key        | key_len | ref  | rows    | Extra                                               |
+----+-------------+--------+-------+--------------------+------------+---------+------+---------+-----------------------------------------------------+
|  1 | SIMPLE      | events | range | PRIMARY,time_stamp | time_stamp | 8       | NULL | 6630605 | Using index condition; Using where; Using temporary |
+----+-------------+--------+-------+--------------------+------------+---------+------+---------+-----------------------------------------------------+
Running this query on the database returns 11511 rows in 2 min 20.41 sec.
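For reference, the `time_stamp` index used in the two runs above could be created with a statement along these lines. The index name `time_stamp` is taken from the `key` column of the EXPLAIN output; the exact statement is an assumption.

```sql
-- Single-column secondary index on time_stamp.
ALTER TABLE events ADD INDEX time_stamp (time_stamp);
```

Because InnoDB secondary indexes also store the primary key columns, this index contains `pv_name` but not `value` or `alarm_severity`, so each matching row still has to be looked up in the clustered index to evaluate the rest of the WHERE clause.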
Update
Following the suggestions, I have changed the table to:
CREATE TABLE `events` (
`pv_name` varchar(60) COLLATE utf8mb4_unicode_ci NOT NULL,
`time_stamp` bigint(20) unsigned NOT NULL,
`event_type` enum('add','init','update','disconnect','remove') COLLATE utf8mb4_unicode_ci NOT NULL,
`value` text CHARACTER SET utf8mb4 COLLATE utf8mb4_bin,
`value_type` varchar(40) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`value_count` bigint(20) DEFAULT NULL,
`alarm_status` enum('NO_ALARM','READ','WRITE','HIHI','HIGH','LOLO','LOW','STATE','COS','COMM','TIMEOUT','HWLIMIT','CALC','SCAN','LINK','SOFT','BAD_SUB','UDF','DISABLE','SIMM','READ_ACCESS','WRITE_ACCESS') COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`alarm_severity` enum('NO_ALARM','MINOR','MAJOR','INVALID') COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`pv_name`,`time_stamp`),
KEY `event_type` (`event_type`,`time_stamp`),
KEY `alarm_severity` (`alarm_severity`,`time_stamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED;
and the query to:
SELECT DISTINCT events.pv_name
FROM events
WHERE events.time_stamp > 0 AND events.time_stamp < 1426224880000000000
AND alarm_severity = 'INVALID'
UNION
SELECT DISTINCT events.pv_name
FROM events
WHERE events.time_stamp > 0 AND events.time_stamp < 1426224880000000000
AND event_type = 'add'
UNION
SELECT DISTINCT events.pv_name
FROM events
WHERE events.time_stamp > 0 AND events.time_stamp < 1426224880000000000
AND event_type = 'disconnect'
UNION
SELECT DISTINCT events.pv_name
FROM events
WHERE events.time_stamp > 0 AND events.time_stamp < 1426224880000000000
AND event_type = 'remove';
Running EXPLAIN on the query gives:
+------+--------------+----------------+-------+-----------------------------------+----------------+---------+------+--------+-------------------------------------------+
| id   | select_type  | table          | type  | possible_keys                     | key            | key_len | ref  | rows   | Extra                                     |
+------+--------------+----------------+-------+-----------------------------------+----------------+---------+------+--------+-------------------------------------------+
|    1 | PRIMARY      | events         | range | PRIMARY,event_type,alarm_severity | alarm_severity | 10      | NULL | 101670 | Using where; Using index; Using temporary |
|    2 | UNION        | events         | range | PRIMARY,event_type,alarm_severity | event_type     | 9       | NULL | 994652 | Using where; Using index; Using temporary |
|    3 | UNION        | events         | range | PRIMARY,event_type,alarm_severity | event_type     | 9       | NULL |  73660 | Using where; Using index; Using temporary |
|    4 | UNION        | events         | range | PRIMARY,event_type,alarm_severity | event_type     | 9       | NULL | 136348 | Using where; Using index; Using temporary |
| NULL | UNION RESULT | <union1,2,3,4> | ALL   | NULL                              | NULL           | NULL    | NULL |   NULL | Using temporary                           |
+------+--------------+----------------+-------+-----------------------------------+----------------+---------+------+--------+-------------------------------------------+
Running the query on the database returns 112620 rows in 1 min 2.45 sec.
[Discussion]:
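The four-branch query can also be written more compactly. The following is a sketch of an equivalent form that has not been measured here: `UNION` (without `ALL`) already removes duplicates, so the per-branch `DISTINCT` is redundant, and the three `event_type` branches can be folded into a single `IN` list. Whether this changes the plan or the timing would need to be checked with EXPLAIN.

```sql
-- Branch 1: rows flagged INVALID, served by the (alarm_severity, time_stamp) index.
SELECT events.pv_name
FROM events
WHERE events.time_stamp > 0 AND events.time_stamp < 1426224880000000000
  AND events.alarm_severity = 'INVALID'
UNION
-- Branch 2: the three event types of interest, served by (event_type, time_stamp).
SELECT events.pv_name
FROM events
WHERE events.time_stamp > 0 AND events.time_stamp < 1426224880000000000
  AND events.event_type IN ('add', 'disconnect', 'remove');
```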
-
Can you provide a sqlfiddle with a little data?
-
How big is the whole table?
-
The table currently has about 12,000,000 rows and will grow steadily.
-
@Loufylouf: I'm not very familiar with sqlfiddle. Would it be representative without a large number of rows in the table?
-
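It would still be better than trying to work this out by hand, and EXPLAIN would still work, so it would be less representative but still useful.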
Tags: mysql query-optimization distinct