【Question Title】: SQL JOIN between A, B and C mixing full and left join
【Posted】: 2018-11-16 06:09:26
【Question】:

I would like to know whether there is a better way to write the left join in the example below:

SELECT TOP 10 COALESCE(A.COD_PRODUCT, B.COD_PRODUCT),
              COALESCE(A.COD_FAMILY, B.COD_FAMILY),
              COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT),
              A.MASS,
              B.VOLUME,
              C.PRICE
FROM FIRSTTABLE A FULL JOIN SECONDTABLE B ON B.COD_PRODUCT = A.COD_PRODUCT
                                          AND B.COD_FAMILY = A.COD_FAMILY
                                          AND B.DATE_EXTRACT = A.DATE_EXTRACT
LEFT JOIN THIRDTABLE C ON C.COD_PRODUCT = COALESCE(A.COD_PRODUCT,B.COD_PRODUCT)
                      AND C.COD_FAMILY = COALESCE(A.COD_FAMILY, B.COD_FAMILY)
                      AND C.DATE_EXTRACT = COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT)

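This join takes a long time to run. I suspect it is very expensive and could be improved.

Edit: the SELECT ... FROM ... JOIN statement I want to improve lives inside a view.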

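【Comments】:

  • You are already coalescing in the SELECT clause. I would like to think SQL Server optimizes this so that it evaluates the COALESCE() only once, and that it adds no real time to the third join beyond having to run first. Perhaps put everything except the third join into a subquery and join outside that subquery (after the SELECT COALESCE() has happened), to make sure SQL Server does the expensive work only once. That may well turn out to be a wash, though.
  • The COALESCE in the SELECT statement seems unnecessary, since you are joining on those same fields. Because you join where the fields match, no NULL fields would be handled unless both tables hold NULL. For example, if B.COD_PRODUCT = A.COD_PRODUCT amounts to NULL = NULL, there would be one row with a NULL COD_PRODUCT. But COALESCE(A.COD_PRODUCT, B.COD_PRODUCT) would still return NULL.
  • When I leave COALESCE out of the SELECT statement, the result changes. With COALESCE there are 942028 rows in which the first three columns are all non-NULL. Without it, there are 640142 such rows when I use table A's fields and 490522 when I use table B's fields. So I will keep the COALESCE.
  • It is unclear which table you want the TOP 10 of, and ordered by which column. That is the main criterion.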
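The row counts reported in the comments follow from how FULL JOIN treats unmatched rows: the key columns of the side without a match come back as NULL. A minimal sketch, using made-up one-row samples (the derived tables `A(k)` and `B(k)` are illustrative and not part of the question's schema):

```sql
-- Hypothetical one-row samples: key 1 exists only in A, key 2 only in B.
SELECT A.k                AS a_key,      -- NULL on the row that comes only from B
       B.k                AS b_key,      -- NULL on the row that comes only from A
       COALESCE(A.k, B.k) AS merged_key  -- filled in on both rows
FROM (VALUES (1)) AS A(k)
FULL JOIN (VALUES (2)) AS B(k)
       ON B.k = A.k;
```

The query returns two rows, and only `merged_key` is populated on both of them, which is why dropping the COALESCE from the SELECT list loses key values.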

Tags: sql sql-server join optimization


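【Solution 1】:

A COALESCE or an OR inside a join condition can slow a query down considerably. Depending on the size of the tables, a better option may be to join table C twice, once on table A and once on table B, and then coalesce in your SELECT clause.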

SELECT TOP 10 COALESCE(A.COD_PRODUCT, B.COD_PRODUCT, C.COD_PRODUCT, Ca.COD_PRODUCT),
              COALESCE(A.COD_FAMILY, B.COD_FAMILY, C.COD_FAMILY, Ca.COD_FAMILY),
              COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT, C.DATE_EXTRACT, Ca.DATE_EXTRACT),
              A.MASS,
              B.VOLUME,
              COALESCE(C.PRICE, Ca.PRICE) AS PRICE -- take the price from whichever join matched
FROM FIRSTTABLE A
FULL JOIN SECONDTABLE B
     ON B.COD_PRODUCT = A.COD_PRODUCT
     AND B.COD_FAMILY = A.COD_FAMILY
     AND B.DATE_EXTRACT = A.DATE_EXTRACT
-- first copy of THIRDTABLE, matched on A's keys
LEFT JOIN THIRDTABLE C
     ON C.COD_PRODUCT = A.COD_PRODUCT
     AND C.COD_FAMILY = A.COD_FAMILY
     AND C.DATE_EXTRACT = A.DATE_EXTRACT
-- second copy of THIRDTABLE, matched on B's keys
LEFT JOIN THIRDTABLE Ca
     ON Ca.COD_PRODUCT = B.COD_PRODUCT
     AND Ca.COD_FAMILY = B.COD_FAMILY
     AND Ca.DATE_EXTRACT = B.DATE_EXTRACT

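【Comments】:

  • What happened to the man in the hat? ;~)
【Solution 2】:

You can split the query in two: first collect all the data that matches FIRSTTABLE, then union it with all the data from SECONDTABLE that has no match in FIRSTTABLE.

This should let SQL Server make better use of the indexes on these tables.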

SELECT A.COD_PRODUCT,
       A.COD_FAMILY,
       A.DATE_EXTRACT,
       A.MASS,
       B.VOLUME,
       C.PRICE
FROM FIRSTTABLE A
LEFT OUTER JOIN SECONDTABLE B
     ON B.COD_PRODUCT = A.COD_PRODUCT
     AND B.COD_FAMILY = A.COD_FAMILY
     AND B.DATE_EXTRACT = A.DATE_EXTRACT
LEFT OUTER JOIN THIRDTABLE C
     ON C.COD_PRODUCT = A.COD_PRODUCT
     AND C.COD_FAMILY = A.COD_FAMILY
     AND C.DATE_EXTRACT = A.DATE_EXTRACT
UNION ALL
SELECT B.COD_PRODUCT,
       B.COD_FAMILY,
       B.DATE_EXTRACT,
       NULL AS MASS,
       B.VOLUME,
       C.PRICE
FROM SECONDTABLE B
LEFT OUTER JOIN THIRDTABLE C
     ON C.COD_PRODUCT = B.COD_PRODUCT
     AND C.COD_FAMILY = B.COD_FAMILY
     AND C.DATE_EXTRACT = B.DATE_EXTRACT
WHERE NOT EXISTS (SELECT 1
                  FROM   FIRSTTABLE A
                  WHERE  A.COD_PRODUCT = B.COD_PRODUCT
                  AND    A.COD_FAMILY = B.COD_FAMILY
                  AND    A.DATE_EXTRACT = B.DATE_EXTRACT)
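Whether the split query really benefits depends on which indexes exist, and the question does not say. A hedged sketch of indexes that would cover the three-column join keys, assuming none exist yet (the index names are invented for illustration):

```sql
-- Assumption: no suitable indexes exist yet; the index names are made up.
-- Each index leads with the join keys and carries the one extra column the query reads.
CREATE NONCLUSTERED INDEX IX_FIRSTTABLE_KEYS
    ON FIRSTTABLE (COD_PRODUCT, COD_FAMILY, DATE_EXTRACT) INCLUDE (MASS);

CREATE NONCLUSTERED INDEX IX_SECONDTABLE_KEYS
    ON SECONDTABLE (COD_PRODUCT, COD_FAMILY, DATE_EXTRACT) INCLUDE (VOLUME);

CREATE NONCLUSTERED INDEX IX_THIRDTABLE_KEYS
    ON THIRDTABLE (COD_PRODUCT, COD_FAMILY, DATE_EXTRACT) INCLUDE (PRICE);
```

If the tables already have a clustered index or primary key on these three columns, extra indexes like these are probably redundant, so it is worth checking the execution plan before creating them.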

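【Comments】:

  • This solution is simple, and it gets rid of the COALESCE and of a possible temp table. It should be accepted as the correct answer.
【Solution 3】:

You could try this: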

--- isolate the full join data from a and b into a temp table
SELECT 
    COD_PRODUCT=    COALESCE(A.COD_PRODUCT, B.COD_PRODUCT),
    COD_FAMILY=     COALESCE(A.COD_FAMILY, B.COD_FAMILY),
    DATE_EXTRACT=   COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT),
    MASS=           A.MASS,
    VOLUME=         B.VOLUME
INTO #TEMP
FROM FIRSTTABLE A
FULL JOIN SECONDTABLE B
     ON B.COD_PRODUCT = A.COD_PRODUCT
     AND B.COD_FAMILY = A.COD_FAMILY
     AND B.DATE_EXTRACT = A.DATE_EXTRACT
-- add index clustered onto table (covering index)
CREATE CLUSTERED INDEX ix_tempCIndex ON #Temp ([COD_PRODUCT],[COD_FAMILY],[DATE_EXTRACT],[MASS],[VOLUME]);

-- left join C to this temp table
SELECT TOP 10
    T.*, C.PRICE
FROM #TEMP T
LEFT JOIN THIRDTABLE C
     ON C.COD_PRODUCT = T.COD_PRODUCT
     AND C.COD_FAMILY = T.COD_FAMILY
     AND C.DATE_EXTRACT = T.DATE_EXTRACT
-- drop temp table
DROP TABLE #TEMP
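The question asks for a statement that can live in a view, and a SQL Server view cannot reference a temporary table. A rough equivalent of the approach above moves the full join into a derived table instead. This is only a sketch: the view name is hypothetical, and the `dbo` schema is an assumption.

```sql
-- Hypothetical view name; the derived table AB takes the place of #TEMP.
CREATE VIEW dbo.V_PRODUCT_MASS_VOLUME_PRICE
AS
SELECT AB.COD_PRODUCT,
       AB.COD_FAMILY,
       AB.DATE_EXTRACT,
       AB.MASS,
       AB.VOLUME,
       C.PRICE
FROM (
    -- full join of A and B, with the keys merged once
    SELECT COALESCE(A.COD_PRODUCT, B.COD_PRODUCT)   AS COD_PRODUCT,
           COALESCE(A.COD_FAMILY, B.COD_FAMILY)     AS COD_FAMILY,
           COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT) AS DATE_EXTRACT,
           A.MASS,
           B.VOLUME
    FROM FIRSTTABLE A
    FULL JOIN SECONDTABLE B
         ON B.COD_PRODUCT = A.COD_PRODUCT
         AND B.COD_FAMILY = A.COD_FAMILY
         AND B.DATE_EXTRACT = A.DATE_EXTRACT
) AS AB
LEFT JOIN THIRDTABLE C
     ON C.COD_PRODUCT = AB.COD_PRODUCT
     AND C.COD_FAMILY = AB.COD_FAMILY
     AND C.DATE_EXTRACT = AB.DATE_EXTRACT;
```

Unlike the temp table, a derived table is not materialized or indexed, and the optimizer is free to expand it, so the resulting plan may be identical to that of the original query. Comparing the two plans is the only way to tell.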

【Comments】:

【Solution 4】:

You can get the same result with UNION or UNION ALL:

    WITH cte AS (
    
        SELECT  
            A.COD_PRODUCT,
            A.COD_FAMILY,
            A.DATE_EXTRACT,
            A.MASS,
            NULL as VOLUME
        FROM
            FIRSTTABLE A
        UNION
        SELECT  
            B.COD_PRODUCT,
            B.COD_FAMILY,
            B.DATE_EXTRACT,
            NULL,
            B.VOLUME
        FROM
            SECONDTABLE B
    )
    SELECT  
        *
    FROM
        cte AB
    LEFT JOIN 
        THIRDTABLE C    ON C.COD_PRODUCT = AB.COD_PRODUCT
            AND C.COD_FAMILY = AB.COD_FAMILY
            AND C.DATE_EXTRACT = AB.DATE_EXTRACT
    

If a product, family and extract-date combination can have both a mass and a volume, you can use an aggregate to merge the rows from A and B:

    WITH cte AS (
        SELECT 
            COD_PRODUCT,
            COD_FAMILY,
            DATE_EXTRACT,
            MAX(MASS) MASS,
            MAX(VOLUME) VOLUME
        FROM (
            SELECT  
                A.COD_PRODUCT,
                A.COD_FAMILY,
                A.DATE_EXTRACT,
                A.MASS,
                NULL as VOLUME
            FROM
                FIRSTTABLE A
            UNION ALL
            SELECT  
                B.COD_PRODUCT,
                B.COD_FAMILY,
                B.DATE_EXTRACT,
                NULL,
                B.VOLUME
            FROM
                SECONDTABLE B
            ) T
        GROUP BY
            COD_PRODUCT,
            COD_FAMILY,
            DATE_EXTRACT
    )
    SELECT  
        *
    FROM
        cte AB
    LEFT JOIN 
        THIRDTABLE C    ON C.COD_PRODUCT = AB.COD_PRODUCT
            AND C.COD_FAMILY = AB.COD_FAMILY
            AND C.DATE_EXTRACT = AB.DATE_EXTRACT
    

【Comments】:

  • Matches will no longer favor FIRSTTABLE in the result, so the output may differ wherever FIRSTTABLE and SECONDTABLE match. Also, a match will return two rows instead of one.
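The two-rows-per-match effect, and how grouping removes it, can be seen on a tiny made-up sample. The CTEs `A`, `B` and `U` below are illustrative only, and the sketch assumes each table holds at most one row per key, since MAX would otherwise silently pick one value:

```sql
-- Hypothetical one-row samples that share the same key.
WITH A(k, MASS)   AS (SELECT 1, 10.0),
     B(k, VOLUME) AS (SELECT 1, 2.5),
     U AS (
         SELECT k, MASS, NULL AS VOLUME FROM A
         UNION ALL
         SELECT k, NULL, VOLUME FROM B
     )
-- U holds two rows for k = 1; grouping folds them back into one.
SELECT k,
       MAX(MASS)   AS MASS,
       MAX(VOLUME) AS VOLUME
FROM U
GROUP BY k;
```

Selecting from `U` directly would return one row with only MASS filled and another with only VOLUME filled. The grouped query returns a single row carrying both values.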
【Solution 5】:

Compare the execution plan against this replacement join:

    LEFT JOIN THIRDTABLE C
         ON (C.COD_PRODUCT = A.COD_PRODUCT or C.COD_PRODUCT = B.COD_PRODUCT)
         AND (C.COD_FAMILY = A.COD_FAMILY or C.COD_FAMILY = B.COD_FAMILY)
         AND (C.DATE_EXTRACT =A.DATE_EXTRACT or C.DATE_EXTRACT = B.DATE_EXTRACT)
    

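【Comments】:

  • Using COALESCE in the join has an estimated cost of 25.64, while using the OR conditions in the join has an estimated cost of 335 119. I think I will stick with my first attempt. One huge loop accounts for 68% of the cost.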
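Estimated costs are only the optimizer's guess. One way to compare the candidates on measured numbers is to switch on I/O and timing statistics and run each version in turn. The sketch below wraps the question's COALESCE join (without the TOP 10) as an example:

```sql
-- Report logical reads and CPU/elapsed time for every statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Candidate under test: the original COALESCE join from the question.
SELECT COALESCE(A.COD_PRODUCT, B.COD_PRODUCT)   AS COD_PRODUCT,
       COALESCE(A.COD_FAMILY, B.COD_FAMILY)     AS COD_FAMILY,
       COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT) AS DATE_EXTRACT,
       A.MASS,
       B.VOLUME,
       C.PRICE
FROM FIRSTTABLE A
FULL JOIN SECONDTABLE B
     ON B.COD_PRODUCT = A.COD_PRODUCT
     AND B.COD_FAMILY = A.COD_FAMILY
     AND B.DATE_EXTRACT = A.DATE_EXTRACT
LEFT JOIN THIRDTABLE C
     ON C.COD_PRODUCT = COALESCE(A.COD_PRODUCT, B.COD_PRODUCT)
     AND C.COD_FAMILY = COALESCE(A.COD_FAMILY, B.COD_FAMILY)
     AND C.DATE_EXTRACT = COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The figures appear with the query's messages rather than in the result set. Repeating the run with each alternative join gives logical reads and elapsed times that can be compared directly.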