【Question title】: Complex MySQL query still using filesort although indexes exist
【Posted】: 2012-03-01 16:49:32
【Question】:

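I have a Joomla table with a very large number of content rows (about 3 million). I am having some trouble rewriting the database queries that run against this table.

Here is my full query: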

SELECT cc.title AS category, a.id, a.title, a.alias, a.title_alias, a.introtext, a.fulltext, a.sectionid, a.state, a.catid, a.created, a.created_by, a.created_by_alias, a.modified, a.modified_by, a.checked_out, a.checked_out_time, a.publish_up, a.publish_down, a.attribs, a.hits, a.images, a.urls, a.ordering, a.metakey, a.metadesc, a.access, CASE WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias) ELSE a.id END AS slug, CASE WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias) ELSE cc.id END AS catslug, CHAR_LENGTH( a.`fulltext` ) AS readmore, u.name AS author, u.usertype, g.name AS groups, u.email AS author_email
FROM j15_content AS a
LEFT JOIN j15_categories AS cc
ON a.catid = cc.id
LEFT JOIN j15_users AS u
ON u.id = a.created_by
LEFT JOIN j15_groups AS g
ON a.access = g.id
WHERE 1
AND a.access <= 0
AND a.catid = 108
AND a.state = 1
AND ( publish_up = '0000-00-00 00:00:00' OR publish_up <= '2012-02-08 00:16:26' )
AND ( publish_down = '0000-00-00 00:00:00' OR publish_down >= '2012-02-08 00:16:26' )
ORDER BY a.title, a.created DESC
LIMIT 0, 10

Here is the output from EXPLAIN:

+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-----------------------------+
| id | select_type | table | type   | possible_keys                                         | key       | key_len | ref                       | rows    | Extra                       |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-----------------------------+
|  1 | SIMPLE      | a     | ref    | idx_access,idx_state,idx_catid,idx_access_state_catid | idx_catid | 4       | const                     | 3108187 | Using where; Using filesort |
|  1 | SIMPLE      | cc    | const  | PRIMARY                                               | PRIMARY   | 4       | const                     |       1 |                             |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                                               | PRIMARY   | 4       | database.a.created_by     |       1 |                             |
|  1 | SIMPLE      | g     | eq_ref | PRIMARY                                               | PRIMARY   | 1       | database.a.access         |       1 |                             |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-----------------------------+

To show which indexes exist, here is the output of SHOW INDEX FROM j15_content:

+-------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table       | Non_unique | Key_name               | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| j15_content |          0 | PRIMARY                |            1 | id          | A         |     3228356 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_section            |            1 | sectionid   | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access             |            1 | access      | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_checkout           |            1 | checked_out | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_state              |            1 | state       | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_catid              |            1 | catid       | A         |           6 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_createdby          |            1 | created_by  | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | title                  |            1 | title       | A         |      201772 |        4 | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access_state_catid |            1 | access      | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access_state_catid |            2 | state       | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access_state_catid |            3 | catid       | A         |           7 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_title_created      |            1 | title       | A         |     3228356 |        8 | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_title_created      |            2 | created     | A         |     3228356 |     NULL | NULL   |      | BTREE      |         |
+-------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

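As you can see, several pieces of data are fetched from the database. By simplifying the query I have confirmed that the real problem lies in the ORDER BY clause. Without sorting the results the query is reasonably responsive. Here is its EXPLAIN: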

+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-------------+
| id | select_type | table | type   | possible_keys                                         | key       | key_len | ref                       | rows    | Extra       |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-------------+
|  1 | SIMPLE      | a     | ref    | idx_access,idx_state,idx_catid,idx_access_state_catid | idx_catid | 4       | const                     | 3108187 | Using where |
|  1 | SIMPLE      | cc    | const  | PRIMARY                                               | PRIMARY   | 4       | const                     |       1 |             |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                                               | PRIMARY   | 4       | database.a.created_by     |       1 |             |
|  1 | SIMPLE      | g     | eq_ref | PRIMARY                                               | PRIMARY   | 1       | database.a.access         |       1 |             |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-------------+

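As you can see, it is the filesort that is killing the server. With this many rows I am doing my best to optimize everything through indexes, but something is still wrong. Any input would be appreciated.

Trying FORCE INDEX did not help: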

explain     SELECT cc.title AS category, a.id, a.title, a.alias, a.title_alias, a.introtext, a.fulltext, a.sectionid, a.state, a.catid, a.created, a.created_by, a.created_by_alias, a.modified, a.modified_by, a.checked_out, a.checked_out_time, a.publish_up, a.publish_down, a.attribs, a.hits, a.images, a.urls, a.ordering, a.metakey, a.metadesc, a.access, CASE WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias) ELSE a.id END AS slug, CASE WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias) ELSE cc.id END AS catslug, CHAR_LENGTH( a.`fulltext` ) AS readmore, u.name AS author, u.usertype, g.name AS groups, u.email AS author_email
    ->     FROM bak_content AS a
    ->     FORCE INDEX (idx_title_created)
    ->     LEFT JOIN bak_categories AS cc
    ->     ON a.catid = cc.id
    ->     LEFT JOIN bak_users AS u
    ->     ON u.id = a.created_by
    ->     LEFT JOIN bak_groups AS g
    ->     ON a.access = g.id
    ->     WHERE 1
    ->     AND a.access <= 0
    ->     AND a.catid = 108
    ->     AND a.state = 1
    ->     AND ( publish_up = '0000-00-00 00:00:00' OR publish_up <= '2012-02-08
    ->     AND ( publish_down = '0000-00-00 00:00:00' OR publish_down >= '2012-0
    ->     ORDER BY a.title, a.created DESC
    ->     LIMIT 0, 10;

Which produces:

+----+-------------+-------+--------+---------------+---------+---------+-------
| id | select_type | table | type   | possible_keys | key     | key_len | ref
+----+-------------+-------+--------+---------------+---------+---------+-------
|  1 | SIMPLE      | a     | ALL    | NULL          | NULL    | NULL    | NULL
|  1 | SIMPLE      | cc    | const  | PRIMARY       | PRIMARY | 4       | const
|  1 | SIMPLE      | u     | eq_ref | PRIMARY       | PRIMARY | 4       | database
|  1 | SIMPLE      | g     | eq_ref | PRIMARY       | PRIMARY | 1       | database
+----+-------------+-------+--------+---------------+---------+---------+-------

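【Comments】:

  • I would try one of the following indexes: (state, catid, access), (state, catid, publish_up), (state, catid, publish_down)
  • How many rows would be returned without the LIMIT?
  • You could also try forcing the idx_title_created index.
  • Thanks ypercube, I will try that as soon as I can. publish_up/down are almost irrelevant and I will most likely remove them from the final query; state, catid and access are the most important. Without the LIMIT there are about 2 million articles in this one category. I tried forcing the index to no avail; I will append the results to the question itself. Thanks everyone.
  • What values can the access column have?

Tags: mysql optimization indexing filesort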


【Answer 1】:

AFAIK this cannot reasonably be solved with indexes, hints, or a restructuring of the query itself.

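The reason it is slow is that it requires a filesort of 2M rows, which genuinely takes a long time. If you look closely at the ordering, it is specified as ORDER BY a.title, a.created DESC. The problem is the combination of sorting on more than one column with a DESC part. MySQL (before version 8.0) does not support descending indexes: the CREATE INDEX statement accepts the DESC keyword, but it is parsed and ignored, reserved for future use.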
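For readers on MySQL 8.0 or later, where the DESC keyword in an index definition is honored rather than ignored, the mixed-direction ordering can be indexed directly. The following is only a sketch under those assumptions. It also assumes that title can be indexed at full length (the existing indexes cover only a prefix), and the index name idx_catid_title_created_desc is hypothetical:

```sql
-- Assumes MySQL 8.0+; on older versions DESC in an index definition is ignored.
-- idx_catid_title_created_desc is a hypothetical name, not an existing index.
CREATE INDEX idx_catid_title_created_desc
    ON j15_content (catid, title ASC, created DESC);

-- Check whether "Using filesort" still appears in the Extra column.
EXPLAIN
SELECT a.id, a.title, a.created
FROM j15_content AS a
WHERE a.catid = 108
ORDER BY a.title, a.created DESC
LIMIT 0, 10;
```

Whether the optimizer actually picks this index still depends on the other WHERE conditions and on table statistics, so the EXPLAIN output is the thing to check.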

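The suggested workaround is to create an extra column "reverse_created" that is populated automatically, so that your query can use ORDER BY a.title, a.reverse_created. You fill it with max_time - created_time. Then create an index on that combination and (if needed) specify that index as a hint.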
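The workaround could be sketched roughly as follows. This is an illustration, not a tested migration: it assumes created is a DATETIME, uses the INT UNSIGNED maximum as the "max_time" constant, assumes title can be indexed without a prefix, and the trigger and index names are hypothetical.

```sql
-- Hypothetical extra column holding an inverted timestamp.
ALTER TABLE j15_content
    ADD COLUMN reverse_created INT UNSIGNED NOT NULL DEFAULT 0;

-- Backfill existing rows: a later created value gives a smaller reverse_created.
-- 4294967295 is the INT UNSIGNED maximum, standing in for "max_time".
UPDATE j15_content
SET reverse_created = 4294967295 - UNIX_TIMESTAMP(created);

-- Keep the column populated automatically on new and changed rows.
CREATE TRIGGER j15_content_reverse_created_ins
BEFORE INSERT ON j15_content
FOR EACH ROW
SET NEW.reverse_created = 4294967295 - UNIX_TIMESTAMP(NEW.created);

CREATE TRIGGER j15_content_reverse_created_upd
BEFORE UPDATE ON j15_content
FOR EACH ROW
SET NEW.reverse_created = 4294967295 - UNIX_TIMESTAMP(NEW.created);

-- Both index columns are now sorted ascending, matching the new ORDER BY.
CREATE INDEX idx_title_reverse_created
    ON j15_content (title, reverse_created);

SELECT a.id, a.title, a.created
FROM j15_content AS a FORCE INDEX (idx_title_reverse_created)
WHERE a.catid = 108 AND a.state = 1 AND a.access <= 0
ORDER BY a.title, a.reverse_created
LIMIT 0, 10;
```

Sorting ascending on reverse_created gives the same row order as sorting descending on created, which is what lets a plain ascending index serve the query.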

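There are several very good blog posts on this topic that explain it better, with examples.

-UPDATE- You should be able to run a quick test of this by removing the "DESC" part from the ORDER BY in your query. The results will be functionally wrong, but the query should then use the existing index you have (otherwise forcing it should work).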
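A reduced form of that quick test might look like the sketch below, which keeps only the columns needed to see the plan. One caveat worth checking: according to the SHOW INDEX output in the question, idx_title_created covers only the first 8 characters of title, and MySQL generally cannot use a prefix index to fully resolve an ORDER BY, so the filesort may remain even without DESC.

```sql
-- Same filter as the original query, but with DESC removed from the ORDER BY.
-- The row order is intentionally "wrong"; only the EXPLAIN plan matters here.
EXPLAIN
SELECT a.id, a.title, a.created
FROM j15_content AS a FORCE INDEX (idx_title_created)
WHERE a.catid = 108 AND a.state = 1 AND a.access <= 0
ORDER BY a.title, a.created
LIMIT 0, 10;
```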


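    【Answer 2】:

    Sometimes MySQL fails to pick the right index. You can work around this by hinting the index you want.

    Hint syntax: http://dev.mysql.com/doc/refman/4.1/en/index-hints.html

    Make sure you have the right indexes, and tune their performance through experimentation.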
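As a brief illustration of the three hint forms, here they are applied to index names taken from the SHOW INDEX output in the question. The queries are reduced examples, not the full original query:

```sql
-- USE INDEX: suggest an index; the optimizer may still prefer a table scan.
SELECT id, title
FROM j15_content USE INDEX (idx_catid)
WHERE catid = 108
LIMIT 10;

-- FORCE INDEX: like USE INDEX, but a table scan is treated as very expensive.
SELECT id, title
FROM j15_content FORCE INDEX (idx_access_state_catid)
WHERE access <= 0 AND state = 1 AND catid = 108
LIMIT 10;

-- IGNORE INDEX: exclude an index from consideration.
SELECT id, title
FROM j15_content IGNORE INDEX (idx_catid)
WHERE catid = 108
LIMIT 10;
```

A hint only affects which of the existing indexes is considered. It cannot make an index usable for a sort that the index does not actually match.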

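    Cheers!

    【Comments】:

    • I have already tried forcing the index, but that does not solve the problem. It seems I may have indexed the wrong columns, but I am not sure which ones should or should not be in the index. I assume some composite index is needed because of the nature of the query.
    【Answer 3】:

    Can you try this variation: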

    SELECT cc.title AS category, ...
    FROM 
        ( SELECT *
          FROM j15_content AS a 
                   USE INDEX (title)             -- try with and without the hint
          WHERE 1
            AND a.access <= 0
            AND a.catid = 108
            AND a.state = 1
            AND ( publish_up = '0000-00-00 00:00:00' 
               OR publish_up <= '2012-02-08 00:16:26' )
            AND ( publish_down = '0000-00-00 00:00:00' 
               OR publish_down >= '2012-02-08 00:16:26' )
          ORDER BY a.title, a.created DESC
          LIMIT 0, 10
        ) AS a
      LEFT JOIN j15_categories AS cc
        ON a.catid = cc.id
      LEFT JOIN j15_users AS u
        ON u.id = a.created_by
      LEFT JOIN j15_groups AS g
        ON a.access = g.id
    

    I think an index on (catid, state, title) would work better.
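Creating and checking that index could look like the sketch below. The index name idx_catid_state_title is hypothetical, and title is assumed to be indexable at full length. Note that with created DESC still in the ORDER BY this index cannot resolve the complete ordering, so the check here uses title alone:

```sql
-- Hypothetical index name; equality columns first, then the sort column.
CREATE INDEX idx_catid_state_title
    ON j15_content (catid, state, title);

-- With catid and state fixed by equality, rows come out of the index
-- already ordered by title; look for the absence of "Using filesort".
EXPLAIN
SELECT id, title
FROM j15_content
WHERE catid = 108 AND state = 1 AND access <= 0
ORDER BY title
LIMIT 0, 10;
```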

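    【Comments】:

    • Thanks, I will try it as soon as I can and get back to you. What effect would an index on (title, catid, id), in that order, have, so that the results are already sorted by title? I am considering splitting this into 2 queries: if I can simply pull out the article IDs, I could then run a separate query using WHERE id IN (1, 2, 3, etc.) to return all the relevant article information.
    • The query still returns: "#1028 - Sort aborted".
    【Answer 4】:

    Perhaps this would help: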

    CREATE INDEX idx_catid_title_created ON j15_content (catid,title(8),created);
    DROP INDEX idx_catid ON j15_content;
    


      【Answer 5】:

      Have you tried increasing the values of tmp_table_size and max_heap_table_size?

      There is a short explanation here, which also links to the details for each setting.
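For reference, the current values can be inspected and raised for a single session as sketched below. The 64 MB figure is an arbitrary assumption for illustration, not a recommendation:

```sql
-- Inspect the current settings.
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Raise both for the current session only (64 MB is an illustrative value).
-- The effective in-memory temporary table limit is the smaller of the two.
SET SESSION tmp_table_size = 64 * 1024 * 1024;
SET SESSION max_heap_table_size = 64 * 1024 * 1024;
```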

      Hope this helps!


        【Answer 6】:

        I hope this is syntactically correct:

        SELECT
            cc.title AS category,
            a.id, a.title, a.alias, a.title_alias,
            a.introtext, a.fulltext, a.sectionid,
            a.state, a.catid, a.created, a.created_by,
            a.created_by_alias, a.modified, a.modified_by,
            a.checked_out, a.checked_out_time,
            a.publish_up, a.publish_down, a.attribs,
            a.hits, a.images, a.urls, a.ordering, a.metakey,
            a.metadesc, a.access,
            CASE WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias) ELSE a.id END AS slug,
            CASE WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias) ELSE cc.id END AS catslug, CHAR_LENGTH( a.`fulltext` ) AS readmore,
            u.name AS author, u.usertype, g.name AS groups, u.email AS author_email 
        FROM
        (
            SELECT aa.*
            FROM 
            (
                SELECT id
                FROM j15_content
                WHERE catid=108 AND state=1
                AND access <= 0
                AND (publish_up   = '0000-00-00 00:00:00' OR   publish_up <= '2012-02-08 00:16:26')
                AND (publish_down = '0000-00-00 00:00:00' OR publish_down >= '2012-02-08 00:16:26')
                ORDER BY title,created DESC
                LIMIT 0,10
            ) needed_keys
            LEFT JOIN j15_content aa USING (id)
        ) a
        LEFT JOIN j15_categories AS cc ON a.catid = cc.id 
        LEFT JOIN j15_users AS u ON a.created_by = u.id
        LEFT JOIN j15_groups AS g ON a.access = g.id;
        

        You will need an index that supports the subquery:

        ALTER TABLE j15_content ADD INDEX subquery_ndx (catid,state,access,title,created);
        

        Give it a try!!!
