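【Posted】: 2020-11-17 07:58:34
【Question】:
I have a very slow query. Its first part is generated by a gem (https://github.com/CanCanCommunity/cancancan, which builds the SELECT and the inner query), and I added ORDER BY and LIMIT to it for cursor-based pagination.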
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 3.48 sec)
Here are the tables:
CREATE TABLE `spree_products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`available_on` datetime DEFAULT NULL,
`permalink` varchar(255) DEFAULT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`count_on_hand` int(11) DEFAULT NULL,
`vendor_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_spree_products_on_vendor_id` (`vendor_id`)
) ENGINE=InnoDB AUTO_INCREMENT=37209248 DEFAULT CHARSET=utf8mb4
CREATE TABLE `spree_vendors` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`active` tinyint(1) DEFAULT '0',
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4413 DEFAULT CHARSET=utf8mb4
(I removed the irrelevant columns to keep things tidy.)
EXPLAIN on the query above returns:
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | spree_vendors | NULL | ALL | PRIMARY | NULL | NULL | NULL | 3465 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | spree_products | NULL | ref | PRIMARY,index_spree_products_on_vendor_id | index_spree_products_on_vendor_id | 5 | _hubert_test.spree_vendors.id | 8613 | 100.00 | Using index |
| 1 | SIMPLE | spree_products | NULL | eq_ref | PRIMARY | PRIMARY | 4 | _hubert_test.spree_products.id | 1 | 100.00 | NULL |
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
When I remove the ORDER BY, the query is fast:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
LIMIT 50;
=> 50 rows in set (0.00 sec)
When I keep the ORDER BY on the outer query but remove the WHERE clause from the subquery, the query is also fast:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
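I tried adding a composite index on spree_vendors.id / spree_vendors.active, but that did not help.
Any ideas on how to optimize this query?
Update 1:
The JOIN variant is slow as well. The DISTINCT is added by the gem to prevent duplicate records in case you do not select all columns: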
SELECT DISTINCT `spree_products`.*
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 43.13 sec)
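Without DISTINCT, the query is fast.
Update 2:
It was pointed out that using LEFT OUTER JOIN in the subquery returns the whole table. But the query is still slow when using INNER JOIN: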
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
INNER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 3.98 sec)
【Comments】:
-
The problem is the WHERE IN, not the ORDER BY. Rewrite it as an INNER JOIN, or at least as WHERE EXISTS. If your framework cannot do that, use raw SQL.
-
P.S. Going by the logic of the query text, your WHERE IN and the whole subquery should be removed entirely; it only checks that spree_products.id is not NULL.
-
...same with 'inner join'
-
Note that LIMIT without ORDER BY is meaningless.
-
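I have updated the question and added a JOIN version of the query. @Akina, what do you mean by "it only checks ..."? The subquery selects all spree_products that have a spree_vendor with active = TRUE, or am I missing something?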
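For reference, the WHERE EXISTS rewrite suggested in the comments could look roughly like the sketch below. It uses only the tables and columns shown in the question and should return the same rows as the original WHERE IN query (products whose vendor is active). Whether it is actually faster depends on the MySQL version and on how the optimizer handles the semi-join; it has not been measured here, so compare the EXPLAIN output on real data before relying on it.

```sql
-- Sketch only: EXISTS form of the slow query, using the schema from the question.
-- The correlated subquery looks up the product's vendor by primary key
-- and keeps the product only if that vendor is active.
SELECT `spree_products`.*
FROM `spree_products`
WHERE EXISTS
    (SELECT 1
     FROM `spree_vendors`
     WHERE `spree_vendors`.`id` = `spree_products`.`vendor_id`
       AND `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
```

Since this form is not what the gem generates, it would have to be issued as raw SQL or through a custom scope.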
Tags: mysql ruby-on-rails database cancancan