【Question title】:Convert NOT IN query to better performance
【Posted】:2014-01-31 09:35:12
【Question description】:

I am using MySQL 5.0 and need to tune this query. Can anyone tell me what I could change to make it faster?

SELECT DISTINCT(alert_master_id) FROM alert_appln_header 
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
SELECT DISTINCT(alert_master_id) FROM alert_details 
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
UNION
SELECT DISTINCT(alert_master_id) FROM alert_sara_header 
WHERE sara_master_id IN 
(SELECT alert_sara_master_id FROM alert_sara_lines 
WHERE end_date IS NULL) AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;

【Question comments】:

  • Sorry, I don't know how to format code here, and I need this urgently.

Tags: mysql sql notin sql-tuning


【Solution 1】:

The first thing I would do is rewrite the subqueries as joins:

SELECT      h.alert_master_id

FROM        alert_appln_header h

       JOIN schedule_config c
         ON c.schedule_name = 'Purging_Config'

  LEFT JOIN alert_details d
         ON d.alert_master_id = h.alert_master_id
        AND d.end_date IS NULL
        AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY

  LEFT JOIN (
              alert_sara_header s
         JOIN alert_sara_lines  l
           ON l.alert_sara_master_id = s.sara_master_id
            )
         ON s.alert_master_id = h.alert_master_id
        AND l.end_date IS NULL
        AND s.created_date < CURRENT_DATE - INTERVAL c.parameters DAY

WHERE       h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
        AND d.alert_master_id IS NULL
        AND s.alert_master_id IS NULL

GROUP BY    h.alert_master_id

LIMIT       5000
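
As a variation on the join rewrite above, here is a minimal sketch that reads the retention period once into a user variable and uses NOT EXISTS in place of NOT IN. The variable name @cutoff is hypothetical, and the sketch assumes that end_date is a column of alert_sara_lines (as in the question's subquery) and that alert_master_id is never NULL, since NOT IN and NOT EXISTS treat NULLs differently. Whether it is actually faster on MySQL 5.0 depends on the data and indexes, so compare the EXPLAIN output before adopting it.

```sql
-- Read the purge threshold once instead of repeating the lookup in every subquery.
-- @cutoff is an illustrative name, not something from the original schema.
SET @cutoff := CURRENT_DATE - INTERVAL (
        SELECT parameters
        FROM   schedule_config
        WHERE  schedule_name = 'Purging_Config'
    ) DAY;

SELECT DISTINCT h.alert_master_id
FROM   alert_appln_header h
WHERE  h.created_date < @cutoff
  -- no open alert_details row for this alert
  AND NOT EXISTS (
        SELECT 1
        FROM   alert_details d
        WHERE  d.alert_master_id = h.alert_master_id
          AND  d.end_date IS NULL
          AND  d.created_date < @cutoff
      )
  -- no SARA header with an open line for this alert
  AND NOT EXISTS (
        SELECT 1
        FROM   alert_sara_header s
        JOIN   alert_sara_lines  l
          ON   l.alert_sara_master_id = s.sara_master_id
        WHERE  s.alert_master_id = h.alert_master_id
          AND  l.end_date IS NULL
          AND  s.created_date < @cutoff
      )
LIMIT 5000;
```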

If it is still slow after that, re-examine your indexing strategy. I would suggest these indexes:

  • alert_appln_header(alert_master_id,created_date)
  • schedule_config(schedule_name)
  • alert_details(alert_master_id,end_date,created_date)
  • alert_sara_header(sara_master_id,alert_master_id,created_date)
  • alert_sara_lines(alert_sara_master_id,end_date)
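
The suggestions above could be created roughly as follows. This is only a sketch: the index names are hypothetical, and it assumes end_date is a column of alert_sara_lines (as the question's subquery suggests) rather than of alert_sara_header. Check SHOW INDEX on each table first, because some of these may already exist as primary or foreign keys.

```sql
-- Index names (idx_...) are illustrative; pick names that fit your conventions.
CREATE INDEX idx_appln_master_created
    ON alert_appln_header (alert_master_id, created_date);

CREATE INDEX idx_config_name
    ON schedule_config (schedule_name);

CREATE INDEX idx_details_master_end_created
    ON alert_details (alert_master_id, end_date, created_date);

CREATE INDEX idx_sara_header_ids_created
    ON alert_sara_header (sara_master_id, alert_master_id, created_date);

CREATE INDEX idx_sara_lines_master_end
    ON alert_sara_lines (alert_sara_master_id, end_date);
```

After adding them, run EXPLAIN on the query to see which indexes the optimizer actually picks.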

【Comments】:

    【Solution 2】:

    OK, this may just be a shot in the dark, but I don't think you need that many DISTINCTs:

    SELECT DISTINCT(alert_master_id) FROM alert_appln_header 
    WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
    AND alert_master_id NOT IN (
         -- removed distinct here --
        SELECT alert_master_id FROM alert_details 
        WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY) 
        UNION
         -- removed distinct here --
        SELECT alert_master_id FROM alert_sara_header 
        WHERE sara_master_id IN 
            (SELECT alert_sara_master_id FROM alert_sara_lines 
            WHERE end_date IS NULL) 
        AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
    ) LIMIT 5000;
    

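    Since DISTINCT can be very expensive, try to avoid it where it is not needed. The outer WHERE clause only checks that an id is NOT in the subquery result, so it does not matter whether an id appears in that result more than once.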
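
    To illustrate the point with a toy example (the tables t_outer and t_inner are hypothetical and not part of the original schema): duplicates inside the NOT IN list do not change which rows pass the filter, so the two SELECTs below return the same rows. Note also that UNION without ALL already removes duplicates across its branches.

```sql
-- Hypothetical toy tables used only for this illustration.
CREATE TABLE t_outer (id INT);
CREATE TABLE t_inner (id INT);

INSERT INTO t_outer VALUES (1), (2), (3);
INSERT INTO t_inner VALUES (2), (2), (3);   -- 2 appears twice

-- Without DISTINCT in the subquery
SELECT id FROM t_outer
WHERE  id NOT IN (SELECT id FROM t_inner);

-- With DISTINCT in the subquery: same result set as above
SELECT id FROM t_outer
WHERE  id NOT IN (SELECT DISTINCT id FROM t_inner);
```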

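    【Comments】:

    • Thank you, sir. The first DISTINCT was my mistake, but I added the other two to reduce the size of the subquery result and make the IN operator faster. I am not sure whether that is correct.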