【Question title】: MySQL query optimiser showing random behaviour for a query on table with primary and composite indexes
【Posted】: 2019-10-04 09:14:09
【Question】:

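I have a MySQL table that I am running a query against. In some cases the query takes very long (~15 minutes) to return results, while in other cases it returns within a few milliseconds. The two queries differ only in the value of one column in the WHERE clause.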

Table definition:

CREATE TABLE `tests` (
  `id` varchar(36) NOT NULL,
  `some_other_id` varchar(36) NOT NULL,
  `col_1` varchar(64) NOT NULL,
  `col_2` varchar(128) DEFAULT NULL,
  `col_3` varchar(64) DEFAULT NULL,
  `status` varchar(32) NOT NULL,
  `created_at_epoch` bigint(20) NOT NULL,
  `updated_at_epoch` bigint(20) NOT NULL,
  `updated_by` varchar(64) NOT NULL,
  `version` int(11) NOT NULL,
  `col_4` text,
  `col_5` varchar(64) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `some_other_id_col_1_col_2_idx` (`some_other_id`,`col_1`,`col_2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

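id and some_other_id are generated from a timestamp followed by random characters. An example some_other_id is "15632901521370150qGUCAQpVuUWK-bJg".

The table has about 60 million rows and about 56 GB of data.

Note the value of some_other_id in the queries below.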

select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_1**' and (test.status in ('activated')) and test.id>='' order by test.id limit 2;
--Executes within milliseconds.
--Explain plan gives key as "some_other_id_col_1_col_2_idx".

select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_1**' and (test.status in ('activated')) and test.id>='' order by test.id limit 1;
--Takes ~14-15 minutes.
--Explain plan gives key as "PRIMARY".

select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_1**' and (test.status in ('activated')) and test.id>='' order by test.id limit 3;
--Executes within milliseconds.
--Explain plan gives key as "some_other_id_col_1_col_2_idx".

select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_2**' and (test.status in ('activated')) and test.id>='' order by test.id limit 2;
--Takes ~14-15 minutes.
--Explain plan gives key as "PRIMARY".

select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_2**' and (test.status in ('activated')) order by test.id limit 2;
--Takes ~14-15 minutes.
--Explain plan gives key as "PRIMARY".

select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_2**' and (test.status in ('activated')) and test.id>='' limit 2;
--Executes within milliseconds.
--Explain plan gives key as "some_other_id_col_1_col_2_idx".

I cannot make sense of this behaviour, and I am looking for an explanation of how it can happen. I am using MySQL 5.6.

【Comments】:

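InnoDB stores the col_4 TEXT data separately from the rest of the table data, which means extra disk I/O is needed to check or fetch the text. But is the server dedicated to running MySQL only, or is it a cloud-based MySQL server on Amazon or Google? Those "random" huge differences, from a few milliseconds to 15 minutes for reading basically the same RAM/disk pages, are very odd.
Can you also provide the full EXPLAIN and/or the EXPLAIN FORMAT=JSON SELECT ... output, as @RickJames also suggested?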

Tags: mysql optimization indexing query-optimization

【Answer 1】:

Add this composite index:

INDEX(status, some_other_id, id)  -- in this order
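
One way to add it, as a sketch: the index name `status_some_other_id_id_idx` is made up for illustration, and the `ALGORITHM=INPLACE, LOCK=NONE` clause assumes the online DDL that InnoDB offers in MySQL 5.6. The reasoning behind the column order is that, once status and some_other_id are both fixed by equality, the matching index entries are already sorted by id, so `ORDER BY id LIMIT n` can stop after reading n entries instead of walking the primary key.

```sql
-- Index name is hypothetical; the column order follows the answer.
ALTER TABLE tests
  ADD INDEX status_some_other_id_id_idx (status, some_other_id, id),
  ALGORITHM=INPLACE, LOCK=NONE;

-- Re-check the plan of a previously slow query.
-- 'VAL_2' is the placeholder value used in the question.
EXPLAIN
SELECT test.id, test.status, test.some_other_id
FROM tests test
WHERE test.some_other_id = 'VAL_2'
  AND test.status IN ('activated')
  AND test.id >= ''
ORDER BY test.id
LIMIT 2;
```

If the optimizer picks the new index, the `key` column of the EXPLAIN output should name it and `Extra` should not mention a filesort. Building an index on roughly 60 million rows will take a while, so it is worth trying on a copy first.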

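With 56 GB of data, you should seriously consider normalization and other techniques for shrinking the table. status is a prime candidate: a TINYINT UNSIGNED takes only 1 byte and provides 256 values. ENUM may be a viable alternative.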
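
A rough sketch of what normalizing status could look like. The lookup table `test_statuses`, its columns, and the new `status_id` column are hypothetical names, and 'activated' is the only status value known from the question, so the real list has to come from the data.

```sql
-- List the distinct values actually stored (a full scan on a table this size).
SELECT status, COUNT(*) AS row_count
FROM tests
GROUP BY status;

-- Hypothetical lookup table; add one row per value found above.
CREATE TABLE test_statuses (
  status_id TINYINT UNSIGNED NOT NULL,
  name      VARCHAR(32)      NOT NULL,
  PRIMARY KEY (status_id),
  UNIQUE KEY name_uq (name)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO test_statuses (status_id, name) VALUES (1, 'activated');

-- tests would then carry a 1-byte id instead of a VARCHAR(32).
ALTER TABLE tests ADD COLUMN status_id TINYINT UNSIGNED NULL;

-- Backfill; on ~60 million rows this would normally be done in batches.
UPDATE tests t
JOIN test_statuses s ON s.name = t.status
SET t.status_id = s.status_id;
```

Once every row is backfilled and the application reads `status_id`, the old `status` column could be dropped and any index on it rebuilt on the smaller column.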

updated_by is another column that could probably be shrunk.

If those epochs are only precise to the second, don't use an 8-byte BIGINT.
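
As a sketch of how to check this before changing anything: a 4-byte INT UNSIGNED holds values up to 4294967295, which covers second-resolution epochs until early 2106. The sample some_other_id in the question appears to begin with a 13-digit, millisecond-style timestamp, so it is an open question whether these columns are in seconds at all.

```sql
-- Inspect the largest stored values (a full scan unless the columns are indexed).
SELECT MAX(created_at_epoch) AS max_created,
       MAX(updated_at_epoch) AS max_updated
FROM tests;

-- Only if both maxima are second-resolution values at or below 4294967295:
ALTER TABLE tests
  MODIFY created_at_epoch INT UNSIGNED NOT NULL,
  MODIFY updated_at_epoch INT UNSIGNED NOT NULL;
```

Millisecond values (13 digits today) do not fit in INT UNSIGNED, in which case BIGINT has to stay. Changing the column type rebuilds the table, so this is best combined with the other schema changes.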

To investigate the performance anomalies further, please provide EXPLAIN FORMAT=JSON SELECT ... for each of them, along with the "Optimizer trace".
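
A minimal sketch of collecting both for one of the slow queries, assuming a session on the same MySQL 5.6 server. 'VAL_2' is the question's placeholder, and the 1 MB trace size is an arbitrary choice. Tracing an EXPLAIN rather than the query itself avoids waiting ~15 minutes for the slow statement to finish.

```sql
-- Plan in JSON form.
EXPLAIN FORMAT=JSON
SELECT test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2,
       test.col_1, test.col_4, test.status, test.some_other_id,
       test.updated_at_epoch, test.updated_by, test.version
FROM tests test
WHERE test.some_other_id = 'VAL_2'
  AND test.status IN ('activated')
  AND test.id >= ''
ORDER BY test.id
LIMIT 2;

-- Optimizer trace for the same statement.
SET optimizer_trace = 'enabled=on';
SET optimizer_trace_max_mem_size = 1048576;  -- raise the limit so the trace is less likely to be truncated

EXPLAIN
SELECT test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2,
       test.col_1, test.col_4, test.status, test.some_other_id,
       test.updated_at_epoch, test.updated_by, test.version
FROM tests test
WHERE test.some_other_id = 'VAL_2'
  AND test.status IN ('activated')
  AND test.id >= ''
ORDER BY test.id
LIMIT 2;

-- The trace of the last traced statement in this session.
SELECT TRACE, MISSING_BYTES_BEYOND_MAX_MEM_SIZE
FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';
```

A non-zero `MISSING_BYTES_BEYOND_MAX_MEM_SIZE` means the trace was cut off and the memory limit needs to be raised further. Repeating the same steps for a fast variant (for example 'VAL_1' with LIMIT 2) gives a trace to compare against.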

【Comments】: