【Posted】: 2016-11-28 22:00:09
【Question】:
I have the following table in MySQL:
CREATE TABLE `events` (
`pv_name` varchar(60) COLLATE utf8mb4_bin NOT NULL,
`time_stamp` bigint(20) unsigned NOT NULL,
`event_type` varchar(40) COLLATE utf8mb4_bin NOT NULL,
`has_data` tinyint(1) NOT NULL,
`data` json DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin ROW_FORMAT=COMPRESSED;
ALTER TABLE `events`
ADD PRIMARY KEY (`pv_name`,`time_stamp`),
ADD UNIQUE KEY `has_data` (`pv_name`,`has_data`,`time_stamp`);
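I'm trying to find the distinct set of pv_names that have at least one row without data between two given times. Both of the following queries appear to return this information: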
mysql> EXPLAIN SELECT pv_name FROM events
WHERE has_data = 0
AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999
GROUP BY events.pv_name;
+----+-------------+--------+------------+-------+------------------+----------+---------+------+---------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+------------------+----------+---------+------+---------+----------+--------------------------+
| 1 | SIMPLE | events | NULL | index | PRIMARY,has_data | has_data | 251 | NULL | 1855281 | 1.11 | Using where; Using index |
+----+-------------+--------+------------+-------+------------------+----------+---------+------+---------+----------+--------------------------+
mysql> EXPLAIN SELECT pv_name, MAX(events.time_stamp) FROM events
WHERE has_data = 0
AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999
GROUP BY events.pv_name;
+----+-------------+--------+------------+-------+------------------+----------+---------+------+--------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+------------------+----------+---------+------+--------+----------+---------------------------------------+
| 1 | SIMPLE | events | NULL | range | PRIMARY,has_data | has_data | 251 | NULL | 203123 | 100.00 | Using where; Using index for group-by |
+----+-------------+--------+------------+-------+------------------+----------+---------+------+--------+----------+---------------------------------------+
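I don't understand why the second query, which places an extra requirement on what is returned (one I don't need), seems to run in less time than the first. Is there a way to improve the first query so it matches the second's efficiency without aggregating the time_stamp column?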
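For context on the second plan: per the MySQL manual, "Using index for group-by" in the Extra column indicates a loose index scan, where the server seeks within each group instead of reading every index entry. The toy Python model below is only a sketch of that idea, not MySQL's implementation; the sorted list standing in for the (pv_name, has_data, time_stamp) index, the sample data, and the function names are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def build_index():
    """Hypothetical sample data, sorted like an index on (pv_name, has_data, time_stamp)."""
    rows = []
    for pv in ("pv_a", "pv_b", "pv_c"):
        for ts in range(1, 101):
            # pv_b always has data; the others lack data on every 10th row.
            has_data = 1 if (pv == "pv_b" or ts % 10) else 0
            rows.append((pv, has_data, ts))
    return sorted(rows)


def full_index_scan(index, lo, hi):
    """Read every entry and filter, roughly like type=index with 'Using where; Using index'."""
    reads = 0
    found = []
    for pv, has_data, ts in index:
        reads += 1
        if has_data == 0 and lo < ts < hi and (not found or found[-1] != pv):
            found.append(pv)
    return found, reads


def loose_index_scan(index, lo, hi):
    """Per pv_name, seek to the largest qualifying time_stamp, then skip to the next group."""
    seeks = 0
    found = []
    pos = 0
    while pos < len(index):
        pv = index[pos][0]
        # Seek 1: last entry strictly below (pv, 0, hi), i.e. MAX(time_stamp) for has_data = 0.
        seeks += 1
        i = bisect_left(index, (pv, 0, hi)) - 1
        if i >= 0 and index[i][0] == pv and index[i][1] == 0 and index[i][2] > lo:
            found.append((pv, index[i][2]))
        # Seek 2: jump past every remaining entry of this pv_name.
        seeks += 1
        pos = bisect_right(index, (pv, float("inf"), float("inf")))
    return found, seeks


index = build_index()
names_full, reads = full_index_scan(index, 0, 1000)
groups_loose, seeks = loose_index_scan(index, 0, 1000)
print(names_full, reads)
print(groups_loose, seeks)
```

In this toy data the full scan touches all 300 entries, while the loose scan needs two seeks per pv_name. That may be why adding MAX(time_stamp) helps here, although the real optimizer's choice depends on more than this model captures.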
Edit:
Following Rick James's suggestion, I changed the has_data index:
ALTER TABLE `events`
ADD PRIMARY KEY (`pv_name`,`time_stamp`),
ADD KEY `has_data` (`has_data`,`pv_name`,`time_stamp`);
This changed the EXPLAIN output to:
mysql> EXPLAIN SELECT pv_name FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
| 1 | SIMPLE | events | NULL | ref | PRIMARY,has_data | has_data | 1 | const | 267096 | 11.11 | Using where; Using index |
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT pv_name, MAX(events.time_stamp) FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
| 1 | SIMPLE | events | NULL | ref | PRIMARY,has_data | has_data | 1 | const | 267096 | 11.11 | Using where; Using index |
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
1 row in set, 1 warning (0.01 sec)
This seems to run faster.
Edit:
Test results requested by Rick James:
mysql> FLUSH STATUS;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT pv_name FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
.
.
.
114480 rows in set (0.34 sec)
mysql> SHOW SESSION STATUS LIKE 'Handler%';
+----------------------------+--------+
| Variable_name | Value |
+----------------------------+--------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 2 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 125527 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+--------+
18 rows in set (0.01 sec)
mysql> SELECT COUNT(*) FROM events;
+----------+
| COUNT(*) |
+----------+
| 3683887 |
+----------+
1 row in set (11.66 sec)
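As a rough reading of the counters above, the short Python calculation below relates Handler_read_next to the number of rows returned and to the table size. The three numbers are copied from the output above; the variable names are only illustrative, and treating Handler_read_next as "index entries read" is a simplification.

```python
# Figures copied from the session output above.
handler_read_next = 125527  # Handler_read_next after the GROUP BY query
result_rows = 114480        # rows returned by the GROUP BY query
table_rows = 3683887        # SELECT COUNT(*) FROM events

# How many index entries were stepped through per returned pv_name.
reads_per_group = handler_read_next / result_rows

# Share of the table's row count that the query stepped through.
fraction_of_table = handler_read_next / table_rows

print(f"{reads_per_group:.2f} index reads per returned pv_name")
print(f"{fraction_of_table:.1%} of the table's row count touched")
```

With the new index this works out to about 1.10 reads per returned pv_name and about 3.4% of the table's row count, which suggests the scan is confined to the has_data = 0 part of the index.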
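Edit:
Run times with the new has_data index (first set of output below) and with the original has_data index (second set):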
mysql> SHOW INDEXES FROM events;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| events | 0 | PRIMARY | 1 | pv_name | A | 216061 | NULL | NULL | | BTREE | | |
| events | 0 | PRIMARY | 2 | time_stamp | A | 4450791 | NULL | NULL | | BTREE | | |
| events | 1 | has_data | 1 | has_data | A | 258 | NULL | NULL | | BTREE | | |
| events | 1 | has_data | 2 | pv_name | A | 496542 | NULL | NULL | | BTREE | | |
| events | 1 | has_data | 3 | time_stamp | A | 4390035 | NULL | NULL | | BTREE | | |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
5 rows in set (0.00 sec)
mysql> EXPLAIN SELECT events.pv_name FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
| 1 | SIMPLE | events | NULL | ref | PRIMARY,has_data | has_data | 1 | const | 267096 | 11.11 | Using where; Using index |
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT events.pv_name, MAX(events.time_stamp) FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
| 1 | SIMPLE | events | NULL | ref | PRIMARY,has_data | has_data | 1 | const | 267096 | 11.11 | Using where; Using index |
+----+-------------+--------+------------+------+------------------+----------+---------+-------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
SELECT events.pv_name FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
114480 rows in set (0.37 sec)
SELECT events.pv_name, MAX(events.time_stamp) FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
114480 rows in set (0.30 sec)
mysql> SHOW INDEXES FROM events;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| events | 0 | PRIMARY | 1 | pv_name | A | 422951 | NULL | NULL | | BTREE | | |
| events | 0 | PRIMARY | 2 | time_stamp | A | 4321990 | NULL | NULL | | BTREE | | |
| events | 0 | has_data | 1 | pv_name | A | 240067 | NULL | NULL | | BTREE | | |
| events | 0 | has_data | 2 | has_data | A | 436525 | NULL | NULL | | BTREE | | |
| events | 0 | has_data | 3 | time_stamp | A | 4205163 | NULL | NULL | | BTREE | | |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
5 rows in set (0.00 sec)
mysql> EXPLAIN SELECT events.pv_name FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
+----+-------------+--------+------------+-------+------------------+----------+---------+------+---------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+------------------+----------+---------+------+---------+----------+--------------------------+
| 1 | SIMPLE | events | NULL | index | PRIMARY,has_data | has_data | 251 | NULL | 4462633 | 1.11 | Using where; Using index |
+----+-------------+--------+------------+-------+------------------+----------+---------+------+---------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT events.pv_name, MAX(events.time_stamp) FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
+----+-------------+--------+------------+-------+------------------+----------+---------+------+--------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+------------------+----------+---------+------+--------+----------+---------------------------------------+
| 1 | SIMPLE | events | NULL | range | PRIMARY,has_data | has_data | 251 | NULL | 240076 | 100.00 | Using where; Using index for group-by |
+----+-------------+--------+------------+-------+------------------+----------+---------+------+--------+----------+---------------------------------------+
1 row in set, 1 warning (0.00 sec)
SELECT events.pv_name FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
114480 rows in set (6.79 sec)
SELECT events.pv_name, MAX(events.time_stamp) FROM events WHERE has_data = 0 AND events.time_stamp > 0 AND events.time_stamp < 9999999999999999999 GROUP BY events.pv_name;
114480 rows in set (2.65 sec)
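To put the four timings above side by side, the small Python calculation below computes how much faster each query ran with the (has_data, pv_name, time_stamp) index than with the original (pv_name, has_data, time_stamp) index. The timings are copied from the output above; the dictionary keys are just labels chosen here, and since each figure is a single run, caching effects may account for part of the difference.

```python
# Wall-clock times in seconds, copied from the output above (single runs).
timings = {
    "new_index": {"plain_group_by": 0.37, "with_max": 0.30},
    "original_index": {"plain_group_by": 6.79, "with_max": 2.65},
}

# Ratio of original-index time to new-index time for each query.
speedup = {
    query: timings["original_index"][query] / timings["new_index"][query]
    for query in timings["new_index"]
}

for query, factor in speedup.items():
    print(f"{query}: about {factor:.1f}x faster with the new index")
```

On these single runs the plain GROUP BY improves by roughly 18x and the MAX(time_stamp) variant by roughly 9x.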
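【Comments】:
- Are both queries fast if you repeat them after the table's data has been read into the buffer pool? Typically the first run of a query is slower than later runs of the same query, because it has to populate the buffer pool.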
Tags: mysql performance group-by