【Question title】: MySQL Select with several joins doing full table scan on joined table first
【Posted】: 2016-03-10 16:23:37
【Question description】:

I have the following query:

SELECT 
  Impressions.id AS `Impressions__id`, 
  Impressions.timestamp AS `Impressions__timestamp`, 
  Impressions.name AS `Impressions__name`, 
  Impressions.lat AS `Impressions__lat`, 
  Impressions.lng AS `Impressions__lng`, 
  Impressions.personas_count AS `Impressions__personas_count`, 
  Impressions.modified AS `Impressions__modified`, 
  Beacons.id AS `Beacons__id`, 
  Beacons.uuid AS `Beacons__uuid`, 
  Beacons.major AS `Beacons__major`, 
  Beacons.minor_dec AS `Beacons__minor_dec`, 
  Beacons.minor_hex AS `Beacons__minor_hex`, 
  Beacons.impressions_count AS `Beacons__impressions_count`, 
  Beacons.created AS `Beacons__created`, 
  Beacons.modified AS `Beacons__modified`, 
  Zones.id AS `Zones__id`, 
  Zones.location_id AS `Zones__location_id`, 
  Zones.beacon_id AS `Zones__beacon_id`, 
  Zones.fixture_no AS `Zones__fixture_no`, 
  Zones.placement AS `Zones__placement`, 
  Zones.floor AS `Zones__floor`, 
  Zones.impressions_count AS `Zones__impressions_count`, 
  Zones.ignore_further_incidents AS `Zones__ignore_further_incidents`, 
  Zones.is_reviewed AS `Zones__is_reviewed`, 
  Zones.review_date AS `Zones__review_date`, 
  Zones.created AS `Zones__created`, 
  Zones.modified AS `Zones__modified`, 
  Locations.id AS `Locations__id`, 
  Locations.retailer_id AS `Locations__retailer_id`, 
  Locations.google_place_id AS `Locations__google_place_id`, 
  Locations.regional_name AS `Locations__regional_name`, 
  Locations.location AS `Locations__location`, 
  Locations.store_no AS `Locations__store_no`, 
  Locations.lat AS `Locations__lat`, 
  Locations.lng AS `Locations__lng`, 
  Locations.address1 AS `Locations__address1`, 
  Locations.address2 AS `Locations__address2`, 
  Locations.address3 AS `Locations__address3`, 
  Locations.city AS `Locations__city`, 
  Locations.state AS `Locations__state`, 
  Locations.postal_code AS `Locations__postal_code`, 
  Locations.region_id AS `Locations__region_id`, 
  Locations.country_id AS `Locations__country_id`, 
  Locations.zones_count AS `Locations__zones_count`, 
  Locations.contacts_count AS `Locations__contacts_count`, 
  Locations.created AS `Locations__created`, 
  Locations.modified AS `Locations__modified`, 
  Devices.id AS `Devices__id`, 
  Devices.os AS `Devices__os`, 
  Devices.bluetooth_enabled AS `Devices__bluetooth_enabled`, 
  Devices.impressions_count AS `Devices__impressions_count`, 
  Devices.modified AS `Devices__modified`, 
  Regions.id AS `Regions__id`, 
  Regions.country_name AS `Regions__country_name`, 
  Regions.subdiv AS `Regions__subdiv`, 
  Regions.subdiv_name AS `Regions__subdiv_name`, 
  Regions.level_name AS `Regions__level_name`, 
  Regions.alt_names AS `Regions__alt_names`, 
  Regions.subdiv_star AS `Regions__subdiv_star`, 
  Regions.subdiv_id AS `Regions__subdiv_id`, 
  Regions.country_id AS `Regions__country_id`, 
  Regions.country_code_2 AS `Regions__country_code_2`, 
  Regions.country_code_3 AS `Regions__country_code_3`, 
  Countries.id AS `Countries__id`, 
  Countries.country_name AS `Countries__country_name`, 
  Countries.alt_names AS `Countries__alt_names`, 
  Countries.code2 AS `Countries__code2`, 
  Countries.code3 AS `Countries__code3`, 
  Countries.iso_cc AS `Countries__iso_cc`, 
  Countries.fips_code AS `Countries__fips_code`, 
  Countries.fips_country_name AS `Countries__fips_country_name`, 
  Countries.un_region AS `Countries__un_region`, 
  Countries.un_subregion AS `Countries__un_subregion`, 
  Countries.comments AS `Countries__comments`, 
  Countries.created AS `Countries__created`, 
  Countries.modified AS `Countries__modified` 
FROM 
  impressions Impressions 
  INNER JOIN beacons Beacons ON Beacons.id = (Impressions.beacon_id) 
  INNER JOIN zones Zones ON Zones.id = (Impressions.zone_id) 
  INNER JOIN devices Devices ON Devices.id = (Impressions.device_id) 
  INNER JOIN locations Locations ON Locations.id = (Zones.location_id) 
  LEFT JOIN regions Regions ON Regions.id = (Locations.region_id) 
  LEFT JOIN countries Countries ON Countries.id = (Locations.country_id) 
ORDER BY 
  Impressions.timestamp desc 
LIMIT 
  15 OFFSET 15

This query takes about 6 seconds to run. The EXPLAIN output is as follows:

+----+-------------+-------------+--------+---------------------------------------+----------------+---------+---------------------------------+-------+---------------------------------+
| id | select_type | table       | type   | possible_keys                         | key            | key_len | ref                             | rows  | Extra                           |
+----+-------------+-------------+--------+---------------------------------------+----------------+---------+---------------------------------+-------+---------------------------------+
|  1 | SIMPLE      | Devices     | ALL    | PRIMARY                               | NULL           | NULL    | NULL                            | 43274 | Using temporary; Using filesort |
|  1 | SIMPLE      | Impressions | ref    | zone_idx,device_id_idx2,beacon_id_idx | device_id_idx2 | 8       | gen1_d2go.Devices.id            |     3 | NULL                            |
|  1 | SIMPLE      | Zones       | eq_ref | PRIMARY,fk_location_idx,comp          | PRIMARY        | 8       | gen1_d2go.Impressions.zone_id   |     1 | NULL                            |
|  1 | SIMPLE      | Beacons     | eq_ref | PRIMARY                               | PRIMARY        | 8       | gen1_d2go.Impressions.beacon_id |     1 | NULL                            |
|  1 | SIMPLE      | Locations   | eq_ref | PRIMARY                               | PRIMARY        | 8       | gen1_d2go.Zones.location_id     |     1 | NULL                            |
|  1 | SIMPLE      | Regions     | eq_ref | PRIMARY                               | PRIMARY        | 4       | gen1_d2go.Locations.region_id   |     1 | NULL                            |
|  1 | SIMPLE      | Countries   | eq_ref | PRIMARY                               | PRIMARY        | 4       | gen1_d2go.Locations.country_id  |     1 | NULL                            |
+----+-------------+-------------+--------+---------------------------------------+----------------+---------+---------------------------------+-------+---------------------------------+
7 rows in set (0.00 sec)

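I don't understand why it prefers to do a full scan of the Devices table. The tables are all indexed; the CREATE statements for Impressions and Devices are below:

Impressions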

CREATE TABLE `impressions` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `device_id` bigint(20) unsigned NOT NULL,
  `beacon_id` bigint(20) unsigned NOT NULL,
  `zone_id` bigint(20) unsigned NOT NULL,
  `timestamp` datetime NOT NULL,
  `google_place_id` bigint(20) unsigned DEFAULT NULL,
  `name` varchar(60) DEFAULT NULL,
  `lat` decimal(12,7) DEFAULT NULL,
  `lng` decimal(12,7) DEFAULT NULL,
  `personas_count` int(10) unsigned DEFAULT '0',
  `created` datetime DEFAULT NULL,
  `modified` datetime DEFAULT NULL,
  PRIMARY KEY (`id`,`timestamp`),
  KEY `zone_idx` (`zone_id`),
  KEY `device_id_idx2` (`device_id`),
  KEY `beacon_id_idx` (`beacon_id`),
  KEY `timestamp_idx` (`id`,`timestamp`),
  KEY `ALL` (`id`,`timestamp`,`name`,`lat`,`lng`,`personas_count`,`modified`),
  CONSTRAINT `beacon_id` FOREIGN KEY (`beacon_id`) REFERENCES `beacons` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
  CONSTRAINT `device2` FOREIGN KEY (`device_id`) REFERENCES `devices` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
  CONSTRAINT `zone_FK` FOREIGN KEY (`zone_id`) REFERENCES `zones` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=303907 DEFAULT CHARSET=utf8;

Devices

CREATE TABLE `devices` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `device_id` bigint(20) unsigned NOT NULL,
  `advertiser_id` char(36) NOT NULL,
  `os` varchar(80) DEFAULT NULL,
  `bluetooth_enabled` tinyint(1) DEFAULT NULL,
  `impressions_count` int(10) unsigned DEFAULT '0',
  `created` datetime DEFAULT NULL,
  `modified` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `advertiser_idx` (`advertiser_id`),
  KEY `ad_dev` (`device_id`,`advertiser_id`),
  KEY `device_id` (`device_id`)
) ENGINE=InnoDB AUTO_INCREMENT=53628 DEFAULT CHARSET=utf8;

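The kicker is:

When I use FORCE INDEX (timestamp_idx) after FROM impressions, it works great. It uses that index and runs in about 0.078 seconds. I don't know why it tries to avoid that index, or why it chooses to read from the Devices table first.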
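For reference, a trimmed sketch of where the hint goes. The column list and the joins are shortened here purely for readability, so this is not the full query; the table names, aliases, and index name are the ones from the post. The second statement shows STRAIGHT_JOIN, which is another way to pin the join order.

```sql
-- Trimmed sketch: column list and joins shortened for readability.
SELECT Impressions.id, Impressions.timestamp, Devices.os
FROM impressions Impressions FORCE INDEX (timestamp_idx)  -- hint follows the table alias
  INNER JOIN devices Devices ON Devices.id = Impressions.device_id
ORDER BY Impressions.timestamp DESC
LIMIT 15 OFFSET 15;

-- Alternative: STRAIGHT_JOIN makes MySQL join the tables in the order
-- they are listed, so Impressions is read first.
SELECT STRAIGHT_JOIN Impressions.id, Impressions.timestamp, Devices.os
FROM impressions Impressions
  INNER JOIN devices Devices ON Devices.id = Impressions.device_id
ORDER BY Impressions.timestamp DESC
LIMIT 15 OFFSET 15;
```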

Update

Including the EXPLAIN with FORCE INDEX:

Current database: gen1_d2go

+----+-------------+-------------+--------+------------------------------+---------------+---------+---------------------------------+------+-------+
| id | select_type | table       | type   | possible_keys                | key           | key_len | ref                             | rows | Extra |
+----+-------------+-------------+--------+------------------------------+---------------+---------+---------------------------------+------+-------+
|  1 | SIMPLE      | Impressions | index  | NULL                         | timestamp_idx | 5       | NULL                            |   15 | NULL  |
|  1 | SIMPLE      | Zones       | eq_ref | PRIMARY,fk_location_idx,comp | PRIMARY       | 8       | gen1_d2go.Impressions.zone_id   |    1 | NULL  |
|  1 | SIMPLE      | Beacons     | eq_ref | PRIMARY                      | PRIMARY       | 8       | gen1_d2go.Impressions.beacon_id |    1 | NULL  |
|  1 | SIMPLE      | Locations   | eq_ref | PRIMARY                      | PRIMARY       | 8       | gen1_d2go.Zones.location_id     |    1 | NULL  |
|  1 | SIMPLE      | Regions     | eq_ref | PRIMARY                      | PRIMARY       | 4       | gen1_d2go.Locations.region_id   |    1 | NULL  |
|  1 | SIMPLE      | Countries   | eq_ref | PRIMARY                      | PRIMARY       | 4       | gen1_d2go.Locations.country_id  |    1 | NULL  |
|  1 | SIMPLE      | Devices     | eq_ref | PRIMARY                      | PRIMARY       | 8       | gen1_d2go.Impressions.device_id |    1 | NULL  |
+----+-------------+-------------+--------+------------------------------+---------------+---------+---------------------------------+------+-------+
7 rows in set (0.01 sec)

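【Comments】:

  • I am pretty sure what I will see - but could you please post the execution plan with FORCE INDEX?
  • @PaulSpiegel Updated the post with the new EXPLAIN.
  • Sorry, I don't have an answer. I just wanted more information in the post so that someone else might be able to help. I have also run into stupid execution plans and had to use STRAIGHT_JOIN or FORCE INDEX to fix them, so I am interested in the answer.
  • STRAIGHT_JOIN works too. I just want to know why MySQL isn't cooperating.

Tags: mysql join indexing inner-join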


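【Answer 1】:

Impressions needs an index that starts with timestamp. That way, the optimizer will hopefully decide to scan Impressions in timestamp order, thereby avoiding the temporary table and the filesort.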
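A minimal sketch of that suggestion, leaving the primary key untouched. The index name timestamp_first_idx is made up for illustration, and the EXPLAIN uses a shortened version of the query. Whether the optimizer actually picks the new index is not guaranteed, so the plan should be checked afterwards.

```sql
-- Hypothetical index name; the point is that `timestamp` is the leading column.
ALTER TABLE impressions
  ADD INDEX timestamp_first_idx (`timestamp`);

-- Re-check the plan (shortened query). The hope is that Impressions is now
-- read first via the new index, without "Using temporary; Using filesort".
EXPLAIN
SELECT Impressions.id, Impressions.timestamp, Devices.os
FROM impressions Impressions
  INNER JOIN devices Devices ON Devices.id = Impressions.device_id
ORDER BY Impressions.timestamp DESC
LIMIT 15 OFFSET 15;
```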

A side lesson... You have 3 indexes that start with id, timestamp. One of them is the PRIMARY KEY, which makes the other two unnecessary.
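One way to spot such overlap, as a sketch: list each index of the table with its columns in order, using information_schema.STATISTICS. The schema name gen1_d2go is taken from the EXPLAIN output in the question; adjust it if yours differs.

```sql
-- One row per index, with its columns in index order.
-- Indexes sharing a leading column prefix end up next to each other.
SELECT INDEX_NAME,
       GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS index_columns
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'gen1_d2go'
  AND TABLE_NAME = 'impressions'
GROUP BY INDEX_NAME
ORDER BY index_columns;
```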

So an additional speedup can be had with:

ALTER TABLE impressions
    DROP INDEX timestamp_idx,  -- as already mentioned
    DROP INDEX `ALL`,          -- ditto (ALL is a reserved word, so it must be quoted)
    DROP PRIMARY KEY,          -- to rearrange it
    ADD PRIMARY KEY(timestamp, id),  -- thus
    ADD INDEX(id);             -- and keep AUTO_INCREMENT happy

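Why? With a PK that starts with timestamp, the query can scan the data itself in timestamp order instead of bouncing between a secondary index and the data. This will speed up the query in question. Caveat: it may hurt other queries.

Other notes...

CHAR(36) smells like a UUID, correct? But with utf8 it takes 108 bytes (3 bytes per character)! Change it to CHAR(36) CHARACTER SET ascii NOT NULL so that it takes only 36 bytes. (Or you could convert to BINARY(16) to save even more; but that is another matter and requires more code.)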
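As a sketch, assuming advertiser_id in the devices table is the UUID-like column and only ever holds ASCII characters, the change could look like this. The second statement only illustrates the packing expression for the BINARY(16) route using a made-up sample UUID; it does not migrate anything.

```sql
-- Assumption: advertiser_id contains only ASCII (hex digits and dashes).
ALTER TABLE devices
  MODIFY advertiser_id CHAR(36) CHARACTER SET ascii NOT NULL;

-- Illustration only: packing a UUID string (sample value, not real data)
-- into 16 bytes, and unpacking it back to hex without the dashes.
SELECT UNHEX(REPLACE('123e4567-e89b-12d3-a456-426614174000', '-', '')) AS packed,
       LOWER(HEX(UNHEX(REPLACE('123e4567-e89b-12d3-a456-426614174000', '-', '')))) AS unpacked_hex;
```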

Unless you have billions of rows, BIGINT (8 bytes) is overkill for ids. INT UNSIGNED takes only 4 bytes.
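Before shrinking any id column, it may be worth checking how far the current values are from the INT UNSIGNED limit. This is only a sketch of that check. Note that InnoDB requires a foreign key column and the column it references to have the same integer size and sign, so for example impressions.device_id and devices.id would have to change together.

```sql
-- INT UNSIGNED holds values up to 4294967295.
SELECT 'impressions' AS table_name, MAX(id) AS max_id, 4294967295 AS int_unsigned_max
FROM impressions
UNION ALL
SELECT 'devices', MAX(id), 4294967295
FROM devices;
```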

Smaller, in various ways, means faster.

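【Comments】:

  • That makes sense - I didn't notice that timestamp_idx starts with id and is therefore redundant with the PK and the ALL index. But I still don't understand why using that index results in a performance boost. I would think MySQL still has to read the complete index into a temporary table and sort it before it can decide which 15 rows to pick. But there is nothing like that in the execution plan.
  • Think of a long list of people, sorted by last name, then first name. But you only have the person's first name.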
【Answer 2】:

As for the full table scan, you cannot optimize it any further. See here for more information: https://cryptkcoding.com/blog/2012/04/06/how-to-optimize-mysql-join-queries-through-indexing/
