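【Question title】: How to handle multiple joins
【Posted】: 2016-06-18 00:19:06
【Question】:

I have a complex query that needs fields from four tables in total. The inner joins make the query take far longer than it should. I ran an EXPLAIN on it and attached the visual plan.

Here is my query: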

SELECT 
   pending_corrections.corrected_plate , pending_corrections.seenDate
FROM
    (pending_corrections
    INNER JOIN cameras ON pending_corrections.camerauid = cameras.camera_id)
        INNER JOIN
    vehicle_vrn ON (pending_corrections.corrected_plate = vehicle_vrn.vrn500
        OR pending_corrections.corrected_plate = vehicle_vrn.vrnno)
        INNER JOIN
    vehicle_ownership ON vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno
WHERE
    pending_corrections.seenDate >= '2015-01-01 00:00:00'
        AND pending_corrections.seenDate <= '2015-01-31 23:59:59'
ORDER BY pending_corrections.corrected_plate , pending_corrections.seenDate ASC;

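How can I get the same result without the OR in one of the joins?

【Comments】:

  • This join condition doesn't look good, at least performance-wise: (pending_corrections.corrected_plate = vehicle_vrn.vrn500 OR pending_corrections.corrected_plate = vehicle_vrn.vrnno). Have you tried EXISTS (...) subqueries instead of the INNER JOINs, since you don't select any data from the joined tables?
  • ORed join conditions are usually bad (they prevent index use); the common workaround is to rewrite the query with UNION.
  • @dnoeth Do you have a suggestion for how to replace it with a UNION?
  • @Dot NET: 1) When asking for help with a query, include your schema definition. 2) A full table scan does not always perform worse than an index lookup. 3) To guess whether replacing a full table scan with an index lookup would help, you need the cardinality of all the predicates, which is also missing from your question.
  • @Dot NET: Neat visual representation of the explain plan. What did you use to produce it?

Tags: mysql sql database performance relational-database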


【Solution 1】:

Rewriting it as a UNION is straightforward: duplicate the query and drop one of the ORed conditions from each copy:

SELECT 
   pending_corrections.corrected_plate , pending_corrections.seenDate
FROM
    (pending_corrections
    INNER JOIN cameras ON pending_corrections.camerauid = cameras.camera_id)
        INNER JOIN
    vehicle_vrn ON (pending_corrections.corrected_plate = vehicle_vrn.vrn500)
        INNER JOIN
    vehicle_ownership ON vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno
WHERE
    pending_corrections.seenDate >= '2015-01-01 00:00:00'
        AND pending_corrections.seenDate <= '2015-01-31 23:59:59'

union 

SELECT 
   pending_corrections.corrected_plate , pending_corrections.seenDate
FROM
    (pending_corrections
    INNER JOIN cameras ON pending_corrections.camerauid = cameras.camera_id)
        INNER JOIN
    vehicle_vrn ON (pending_corrections.corrected_plate = vehicle_vrn.vrnno)
        INNER JOIN
    vehicle_ownership ON vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno
WHERE
    pending_corrections.seenDate >= '2015-01-01 00:00:00'
        AND pending_corrections.seenDate <= '2015-01-31 23:59:59'

ORDER BY 1,2;

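Is there an index on pending_corrections.seenDate?

【Comments】:

  • Yes, there is an index. I tried this statement, but two table scans are still being performed.
  • Here is the EXPLAIN output for the query above. The query takes a very long time to run, so something seems wrong. imgur.com/AF6jlIG
  • @DotNET: How many rows are returned? How many rows are in pending_corrections, and how many match the WHERE condition?
  • About 200k rows match the WHERE condition, out of roughly 2 million rows in total.
【Solution 2】:

You could try the following: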

select 
   pending_corrections.corrected_plate , pending_corrections.seenDate
from pending_corrections
where pending_corrections.seenDate >= '2015-01-01 00:00:00'
  and pending_corrections.seenDate <= '2015-01-31 23:59:59'
  and exists(select 1 from cameras where pending_corrections.camerauid = cameras.camera_id)
  and exists(select 1 from vehicle_ownership where vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno)
  and exists(select 1 from vehicle_vrn 
             where pending_corrections.corrected_plate in (vehicle_vrn.vrnno, vehicle_vrn.vrn500))
order by 1,2;

Or, as already suggested, with a UNION:

select * from (
select 
   pending_corrections.corrected_plate , pending_corrections.seenDate
from pending_corrections
where pending_corrections.seenDate >= '2015-01-01 00:00:00'
  and pending_corrections.seenDate <= '2015-01-31 23:59:59'
  and exists(select 1 from cameras where pending_corrections.camerauid = cameras.camera_id)
  and exists(select 1 from vehicle_ownership where vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno)
  and exists(select 1 from vehicle_vrn where pending_corrections.corrected_plate = vehicle_vrn.vrnno)
union
select 
   pending_corrections.corrected_plate , pending_corrections.seenDate
from pending_corrections
where pending_corrections.seenDate >= '2015-01-01 00:00:00'
  and pending_corrections.seenDate <= '2015-01-31 23:59:59'
  and exists(select 1 from cameras where pending_corrections.camerauid = cameras.camera_id)
  and exists(select 1 from vehicle_ownership where vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno)
  and exists(select 1 from vehicle_vrn where pending_corrections.corrected_plate = vehicle_vrn.vrn500)
) AS t ORDER BY 1, 2;

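P.S. Of course, without the data and without knowing all your indexes, I can't test this myself.

【Comments】:

  • I'm trying the first query you suggested, but I get the following error: "Unknown column 'vehicle_vrn.fk_sysno' in 'where clause'". As far as I can tell it is defined in the query, isn't it?
  • Just a small remark: ORDER BY ordinal position is deprecated (it was removed from the ANSI SQL standard in the 90s). Use column aliases instead.
  • fk_sysno: vehicle_ownership needs to actually be JOINed to vehicle_vrn, not just referenced in a separate EXISTS.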
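
Following the last comment, the "unknown column" error arises because vehicle_vrn is not in scope inside the vehicle_ownership subquery. A minimal sketch of one way to repair the first query is to join vehicle_ownership to vehicle_vrn inside a single EXISTS. The aliases pc, c, v and vo are introduced here for readability and are not part of the original answer. Note two caveats: unlike the INNER JOIN version, EXISTS returns each pending_corrections row at most once even when several vehicle rows match, and the IN (...) test is still logically an OR, so it may not use the indexes on vrnno and vrn500.

```sql
SELECT pc.corrected_plate, pc.seenDate
FROM pending_corrections AS pc
WHERE pc.seenDate >= '2015-01-01 00:00:00'
  AND pc.seenDate <= '2015-01-31 23:59:59'
  -- the camera must exist
  AND EXISTS (SELECT 1
              FROM cameras AS c
              WHERE c.camera_id = pc.camerauid)
  -- a matching vehicle must exist AND have an ownership row;
  -- v is in scope here, so v.fk_sysno resolves
  AND EXISTS (SELECT 1
              FROM vehicle_vrn AS v
              INNER JOIN vehicle_ownership AS vo
                      ON vo.fk_sysno = v.fk_sysno
              WHERE pc.corrected_plate IN (v.vrnno, v.vrn500))
ORDER BY pc.corrected_plate, pc.seenDate;
```

Running EXPLAIN on this form and on the UNION form should show which one the optimizer handles better for your data.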
【Solution 3】:
      ( SELECT  pc.corrected_plate , pc.seenDate
            FROM  pending_corrections AS pc
            INNER JOIN  cameras AS c ON pc.camerauid = c.camera_id
            INNER JOIN  vehicle_vrn AS v ON pc.corrected_plate = v.vrn500
            INNER JOIN  vehicle_ownership AS vo ON v.fk_sysno = vo.fk_sysno
            WHERE  pc.seenDate >= '2015-01-01'
              AND  pc.seenDate  < '2015-01-01' + INTERVAL 1 MONTH  -- note improved pattern
      )
    UNION  ALL   -- or use DISTINCT if you could have dups
      ( SELECT  pc.corrected_plate , pc.seenDate
            FROM  pending_corrections AS pc
            INNER JOIN  cameras AS c ON pc.camerauid = c.camera_id
            INNER JOIN  vehicle_vrn AS v ON pc.corrected_plate = v.vrnno
            INNER JOIN  vehicle_ownership AS vo ON v.fk_sysno = vo.fk_sysno
            WHERE  pc.seenDate >= '2015-01-01'
              AND  pc.seenDate  < '2015-01-01' + INTERVAL 1 MONTH 
      )
    ORDER BY  corrected_plate , seenDate; 
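
The "improved pattern" noted in the query is a half-open date range. As a rough illustration, assuming seenDate is a DATETIME that might carry fractional seconds (the question does not say), the two forms can select different rows:

```sql
-- Closed range: a row stamped '2015-01-31 23:59:59.500' falls outside it,
-- and the end-of-month literal has to be worked out by hand each time.
SELECT COUNT(*)
FROM pending_corrections
WHERE seenDate >= '2015-01-01 00:00:00'
  AND seenDate <= '2015-01-31 23:59:59';

-- Half-open range: everything from the start of January up to, but not
-- including, the start of February, whatever the column precision.
SELECT COUNT(*)
FROM pending_corrections
WHERE seenDate >= '2015-01-01'
  AND seenDate <  '2015-01-01' + INTERVAL 1 MONTH;
```

Both forms can use an index on seenDate; the second is simply harder to get wrong.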

You need:

pc: INDEX(seenDate)  -- which you said you have
c:  INDEX(camera_id) -- unless you have PRIMARY KEY(camera_id)
v:  INDEX(vrn500)
v:  INDEX(vrnno)
vo: INDEX(fk_sysno) -- sounds like it already exists
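
The index list above could be created along these lines. This is only a sketch: the index names are made up for illustration, the column types are unknown, and any index that already exists (or is covered by a PRIMARY KEY) should be skipped.

```sql
-- pc: already present according to the question author
ALTER TABLE pending_corrections ADD INDEX idx_pc_seenDate (seenDate);

-- c: not needed if camera_id is already the PRIMARY KEY
ALTER TABLE cameras ADD INDEX idx_c_camera_id (camera_id);

-- v: one index per column, so each branch of the UNION can use its own
ALTER TABLE vehicle_vrn
  ADD INDEX idx_v_vrn500 (vrn500),
  ADD INDEX idx_v_vrnno (vrnno);

-- vo: probably exists already if fk_sysno is a declared foreign key
ALTER TABLE vehicle_ownership ADD INDEX idx_vo_fk_sysno (fk_sysno);

-- List what is actually defined before and after
SHOW INDEX FROM vehicle_vrn;
```

After adding them, rerun EXPLAIN on the UNION query to check whether the table scans on vehicle_vrn are gone.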

【Comments】:
