【Question title】: MySQL: optimize a 500M+ row table
【Posted】: 2021-10-27 17:07:57
【Question】:

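I have a fairly simple table with more than 500 million rows. Even a very simple query takes 2 to 3.5 minutes, although the column used in the WHERE clause is indexed.

How can I optimize this table and/or the query?

Query and result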

mysql> SELECT COUNT(emails_id) AS count FROM person_deliveries WHERE DATE(date) = '2021-08-23' ;
+--------+
| count  |
+--------+
| 539438 |
+--------+
1 row in set (2 min 20.05 sec)

EXPLAIN output

mysql> EXPLAIN SELECT COUNT(emails_id) AS count FROM person_deliveries WHERE DATE(date) = '2021-08-23' ;
+----+-------------+-------------------+------------+-------+---------------+----------------------+---------+------+-----------+----------+--------------------------+
| id | select_type | table             | partitions | type  | possible_keys | key                  | key_len | ref  | rows      | filtered | Extra                    |
+----+-------------+-------------------+------------+-------+---------------+----------------------+---------+------+-----------+----------+--------------------------+
|  1 | SIMPLE      | person_deliveries | NULL       | index | NULL          | campaigns_id         | 4       | NULL | 454956815 |   100.00 | Using where; Using index |
+----+-------------+-------------------+------------+-------+---------------+----------------------+---------+------+-----------+----------+--------------------------+

SHOW CREATE TABLE

CREATE TABLE `person_deliveries` (
  `emails_id` int unsigned NOT NULL,
  `campaigns_id` int NOT NULL,
  `date` datetime NOT NULL,
  `vmta` varchar(255) DEFAULT NULL,
  `ip_address` varchar(15) DEFAULT NULL,
  `domain` varchar(255) DEFAULT NULL,
  UNIQUE KEY `person_campaign_date` (`emails_id`,`campaigns_id`,`date`),
  KEY `ip_address` (`ip_address`),
  KEY `domain` (`domain`),
  KEY `campaigns_id` (`campaigns_id`),
  KEY `date` (`date`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Thanks in advance!

【Comments】:

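  • While I believe this is on-topic here, at this scale you may get more out of deleting this post and re-asking it on Stack Overflow's sister site, Database Administrators.
  • There is already some material on this topic, but see dev.mysql.com/doc/refman/8.0/en/…
  • Counting can be very slow on large sets. Possibly related: stackoverflow.com/q/511820/1370000
  • I think DATE(date) = '2021-08-23' is the problem; it defeats the index. Try date >= '2021-08-23 00:00:00' AND date <= '2021-08-23 23:59:59' and let us know.
  • en.wikipedia.org/wiki/Sargable - as Silvanu commented, whenever you apply a function to a column in a predicate, you lose the index optimization for it.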
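
The point made in these comments can be checked with EXPLAIN before running the query for real. A minimal sketch, assuming the table and the `date` index from the question; with the sargable form one would hope to see a range access on the `date` key instead of the full index scan over roughly 455 million rows, though the plan the optimizer actually picks depends on the data and the server version:

```sql
-- Non-sargable: DATE() wraps the indexed column, so MySQL cannot
-- use the index on `date` to narrow the scan.
EXPLAIN SELECT COUNT(emails_id) AS count
FROM person_deliveries
WHERE DATE(date) = '2021-08-23';

-- Sargable: the bare column is compared against constants,
-- so the index on `date` is a candidate for a range scan.
EXPLAIN SELECT COUNT(emails_id) AS count
FROM person_deliveries
WHERE date >= '2021-08-23 00:00:00'
  AND date <= '2021-08-23 23:59:59';
```

Comparing the `type`, `key`, and `rows` columns of the two plans shows whether the rewrite took effect.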

Tags: mysql optimization


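【Solution 1】:

As Silvanu and DRapp commented, my use of the DATE() function was what slowed the query down. Wrapping the column in a function kept MySQL from using the index on date, so it scanned an entire index instead (type = index, about 455 million rows in the EXPLAIN output). Comparing the bare column against a range fixes this: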

mysql> SELECT COUNT(emails_id) AS count FROM person_deliveries WHERE date >= '2021-08-23 00:00:00' AND date <= '2021-08-23 23:59:59' ;
+--------+
| count  |
+--------+
| 539438 |
+--------+
1 row in set (0.47 sec)

【Comments】:

  • This answer is correct, but it is always safer to change the last predicate to: AND date < '2021-08-24 00:00:00'.
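
The half-open range suggested in this comment could look like the sketch below, assuming the same table as in the question. It is safer because an inclusive upper bound of 23:59:59 would miss values in the last second of the day if the column ever stored fractional seconds (for example as DATETIME(6)), whereas an exclusive bound at the start of the next day covers the whole day regardless of precision:

```sql
-- Half-open interval: inclusive lower bound, exclusive upper bound.
-- Covers every instant of 2021-08-23, including fractional seconds.
SELECT COUNT(emails_id) AS count
FROM person_deliveries
WHERE date >= '2021-08-23 00:00:00'
  AND date <  '2021-08-24 00:00:00';
```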