【Posted】: 2012-01-08 14:08:45
【Question】:
I have a table `statistics` with the following structure:
+-------------------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+----------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| created_at | datetime | YES | MUL | NULL | |
| year_in_tz | smallint(5) unsigned | YES | MUL | NULL | |
| month_in_tz | tinyint(3) unsigned | YES | MUL | NULL | |
+-------------------+----------------------+------+-----+---------+----------------+
It has keys on created_at, year_in_tz, month_in_tz, and (year_in_tz, month_in_tz):
ALTER TABLE `statistics` ADD INDEX created_at (created_at);
alter table statistics add index year_in_tz (year_in_tz);
alter table statistics add index month_in_tz (month_in_tz);
alter table statistics add index year_month_in_tz(year_in_tz,month_in_tz);
Some example queries...
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics`
GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (5.04 sec)
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| 1 | SIMPLE | statistics | index | NULL | year_month_in_tz | 5 | NULL | 7797984 | Using index |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
1 row in set (0.01 sec)
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics`
WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59')
GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (1 min 33.46 sec)
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
| 1 | SIMPLE | statistics | index | created_at | year_month_in_tz | 5 | NULL | 7797984 | Using where |
+----+-------------+--------------------+-------+---------------+------------------+---------+------+---------+-------------+
1 row in set (0.07 sec)
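So when I combine a WHERE clause on an indexed column with a GROUP BY on indexed columns, the query becomes extremely slow. Does anyone know how to make the last query faster?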
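As a diagnostic sketch (not a fix), the optimizer's choice of year_month_in_tz for the filtered query can be compared against the single-column created_at index by using an index hint. FORCE INDEX is standard MySQL syntax; whether it actually helps on this data is an assumption that has to be checked with DESCRIBE and timing.

```sql
-- Diagnostic sketch: ask MySQL to use the created_at index for the range
-- filter instead of scanning the whole year_month_in_tz index.
-- Assumption: the created_at index from the ALTER TABLE statements exists.
DESCRIBE
SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics` FORCE INDEX (created_at)
WHERE created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59'
GROUP BY year_in_tz, month_in_tz;
```

Because the date range in this example matches nearly every row in the table, a non-covering index may well turn out no faster than the full index scan.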
P.S. After playing with the indexes, I found that a new index on (created_at, year_in_tz, month_in_tz) makes the query run faster, but I am aiming for 0-1 seconds per query, not 10 seconds:
alter table statistics add index created_at_with_year_and_month_in_tz (created_at,year_in_tz,month_in_tz);
mysql> describe SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | statistics | range | created_at,created_at_with_year_and_month_in_tz | created_at_with_year_and_month_in_tz | 9 | NULL | 3612208 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+--------------------+-------+-------------------------------------------------+--------------------------------------+---------+------+---------+-----------------------------------------------------------+
1 row in set (0.05 sec)
mysql> SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz FROM `statistics` WHERE (created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59') GROUP BY year_in_tz, month_in_tz;
+-----------+------------+-------------+
| count_all | year_in_tz | month_in_tz |
+-----------+------------+-------------+
| 467890 | 2011 | 11 |
| 7339389 | 2011 | 12 |
+-----------+------------+-------------+
2 rows in set (10.62 sec)
【Comments】:
-
Just curious: since year_in_tz will be the same in your example, what happens if you omit it from the GROUP BY, as described in the article?
-
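xQbert, nothing changed, but thanks for the good query-optimization idea (omit year_in_tz from the GROUP BY when the selected range falls within a single year).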
-
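It was just an idea from the article above: "You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group." I'm as stumped as you are now.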
-
I just changed the last DESCRIBE output to the correct one... the previous one was for a different query.
-
Can you list the definitions of the keys you created?
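For reference, the suggestion discussed in the comments (dropping year_in_tz from the GROUP BY when the range stays within one year) might look like the sketch below. It relies on MySQL's permissive GROUP BY handling, so it assumes ONLY_FULL_GROUP_BY is not enabled; the asker reported that this made no difference to the run time.

```sql
-- Sketch of the comment's suggestion: group by month only, on the
-- assumption that every selected row has the same year_in_tz.
-- Assumption: ONLY_FULL_GROUP_BY is disabled; otherwise MySQL rejects the
-- nonaggregated year_in_tz column in the select list.
SELECT COUNT(*) AS count_all, year_in_tz, month_in_tz
FROM `statistics`
WHERE created_at BETWEEN '2011-10-31 20:00:00' AND '2011-12-31 19:59:59'
GROUP BY month_in_tz;
```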
Tags: mysql indexing group-by innodb