【Question title】: Slow performance with recursive SQL (CTE)
【Posted】: 2017-04-17 15:27:53
【Question】:

Sorry if this is a slightly long question, but there is no simple way to put it.

I have the following query

SELECT 
    S.*
FROM 
    Stock S
LEFT JOIN 
    Stock_Category SC ON SC.StockId = S.Id
WHERE 
    S.Published = 1 
    AND (@CategoryId IS NULL OR 
         (SELECT COUNT(*) 
          FROM GetParentCategoriesByCategoryId(SC.CategoryId) 
          WHERE Id = @CategoryId) > 0) 

Inside GetParentCategoriesByCategoryId(), I have the following common table expression (CTE):

DECLARE @TableOutput TABLE(Id UNIQUEIDENTIFIER, 
                            PosDissectionId INT,
                            PosFamilyClassId INT,
                            ParentId UNIQUEIDENTIFIER,
                            Code NVARCHAR(25),
                            [Name] NVARCHAR(100),
                            Description NVARCHAR(1000),
                            AzureId UNIQUEIDENTIFIER,
                            Extension NVARCHAR(10),
                            Visible BIT,
                            OrderIndex INT,
                            StockCount INT,
                            Depth INT)
BEGIN
    DECLARE @TotalVisible INT,
            @TotalRows INT

    ;WITH CategoryStructure (Id, ParentId, ParentName, Name, Depth, Visible)
    As 
    ( 
        SELECT 
            C.Id, 
            C.ParentId, 
            CAST('' AS NVARCHAR(500)) AS ParentName, 
            C.Name, 
            0 AS Depth, 
            C.Visible
        FROM 
            Category C
        WHERE 
            Id = @LocalCategoryId

        UNION ALL

        SELECT 
            ParentCategory.Id, 
            ParentCategory.ParentId, 
            CategoryStructure.Name AS ParentName, 
            ParentCategory.Name, 
            CategoryStructure.Depth + 1,
            ParentCategory.Visible
        FROM 
            Category ParentCategory
        INNER JOIN 
            CategoryStructure ON ParentCategory.Id = CategoryStructure.ParentId
    )
    INSERT INTO @TableOutput
        SELECT          
            C.*,
            SC.StockCount,
            CS.Depth
        FROM 
            CategoryStructure CS 
        INNER JOIN 
            Category C ON  C.Id = CS.Id
        LEFT JOIN 
            (SELECT CategoryId, COUNT(*) AS StockCount 
             FROM Stock_Category SC
             INNER JOIN Stock S ON S.Id = SC.StockId
             WHERE S.Published = 1 AND 
                 ((S.WidthMM IS NOT NULL AND 
                   S.HeightMM IS NOT NULL AND 
                   S.DepthMM IS NOT NULL AND
                    S.WeightG IS NOT NULL)) AND
                CategoryId IN(SELECT CategoryId FROM CategoryStructure)
        GROUP BY CategoryId

    ) SC ON SC.CategoryId = CS.Id

    WHERE (@IncludeSelf = 1 OR CS.Id != @CategoryId) 

    SELECT 
        @TotalVisible = SUM(CONVERT(INT, Visible)),
        @TotalRows = COUNT(*) 
    FROM @TableOutput

    IF @TotalVisible <> @TotalRows
        DELETE FROM @TableOutput    

    RETURN
END

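My query execution plan looks like this.

Unfortunately, the query takes over 7 seconds for 2000 rows. I believe I have added the correct indexes (the plan seems to show that the query is using them).

I have been able to narrow the problem down to the LEFT JOIN in the CTE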

   SELECT CategoryId, COUNT(*) AS StockCount 
   FROM Stock_Category SC
   INNER JOIN Stock S ON S.Id = SC.StockId
   WHERE S.Published = 1 AND blah blah blah....

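because performance improves dramatically when I remove it, but that is as far as I have been able to deduce.

I'm not expecting a full solution, as I know it depends on many factors, but I am far from an SQL expert and was hoping someone could offer some guidance on what I should be looking for?

The table schemas can be found here: https://www.dropbox.com/s/tpetq6fky58fhti/schemas.sql?dl=0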

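【Comments】:

  • Can you provide sample data and the schemas of the tables involved?
  • Use Paste The Plan @ brentozar.com to share your execution plan; instructions are here: How to Use Paste the Plan
  • You probably don't want (SELECT COUNT(*) FROM GetParentCategoriesByCategoryId(SC.CategoryId) WHERE Id = @CategoryId) > 0, but rather EXISTS (SELECT 1 FROM GetParentCategoriesByCategoryId(SC.CategoryId) WHERE Id = @CategoryId). Also, if you can, you may want to get rid of that function call. Recursive CTEs are not fast.
  • Your multi-statement table-valued function is hurting you here; I would rewrite it as an inline table-valued function. When is a SQL function not a function? "If it's not inline, it's rubbish." - Rob Farley
  • Rather than calling the multi-statement TVF once for every individual category id, could you materialize the whole hierarchy up front to get all the category ids that meet your criteria? Then use something like SELECT S.* FROM Stock S LEFT JOIN Stock_Category SC ON SC.StockId = S.Id WHERE S.Published = 1 AND (@CategoryId IS NULL OR SC.CategoryId IN (SELECT CategoryId FROM #Cats)) OPTION (RECOMPILE)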
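
The pre-materialization idea from the comments could look roughly like the sketch below. It is only an illustration under some assumptions: the Category table forms a tree through ParentId, the Descendants CTE name is made up for this example, and the visibility rule from the original function is left out. It walks down from @CategoryId once instead of walking up from every Stock_Category row, and uses EXISTS so a stock item in several matching categories is not returned more than once.

```sql
-- Assumed to be the parameter from the question; NULL means "no category filter".
DECLARE @CategoryId UNIQUEIDENTIFIER = NULL;

-- Temp table named #Cats as in the comment above.
CREATE TABLE #Cats (CategoryId UNIQUEIDENTIFIER PRIMARY KEY);

-- Collect @CategoryId and all of its descendants in one pass.
WITH Descendants (Id) AS
(
    SELECT C.Id
    FROM Category C
    WHERE C.Id = @CategoryId

    UNION ALL

    SELECT Child.Id
    FROM Category Child
    INNER JOIN Descendants D ON Child.ParentId = D.Id
)
INSERT INTO #Cats (CategoryId)
SELECT Id FROM Descendants;

-- Filter stock against the materialized list instead of calling the function per row.
SELECT S.*
FROM Stock S
WHERE S.Published = 1
  AND (@CategoryId IS NULL
       OR EXISTS (SELECT 1
                  FROM Stock_Category SC
                  INNER JOIN #Cats X ON X.CategoryId = SC.CategoryId
                  WHERE SC.StockId = S.Id))
OPTION (RECOMPILE);

DROP TABLE #Cats;
```

Whether this is faster in practice depends on the data and indexes, so it is worth comparing execution plans before and after.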

Tags: sql sql-server performance tsql common-table-expression


【Answer 1】:

I made 2 changes:

1) Made the function inline (the visibility check becomes a HAVING clause)

2) Replaced the LEFT JOIN with OUTER APPLY.

WITH CategoryStructure (Id, ParentId, ParentName, Name, Depth, Visible)
As 
( 
    SELECT 
        C.Id, 
        C.ParentId, 
        CAST('' AS NVARCHAR(500)) AS ParentName, 
        C.Name, 
        0 AS Depth, 
        C.Visible
    FROM 
        Category C
    WHERE 
        Id = @LocalCategoryId

    UNION ALL

    SELECT 
        ParentCategory.Id, 
        ParentCategory.ParentId, 
        CategoryStructure.Name AS ParentName, 
        ParentCategory.Name, 
        CategoryStructure.Depth + 1,
        ParentCategory.Visible
    FROM 
        Category ParentCategory
    INNER JOIN 
        CategoryStructure ON ParentCategory.Id = CategoryStructure.ParentId
)
INSERT INTO @TableOutput
    SELECT          
        C.*,
        SC.StockCount,
        CS.Depth
    FROM 
        CategoryStructure CS 
    INNER JOIN 
        Category C ON  C.Id = CS.Id
    OUTER APPLY
        (SELECT COUNT(*) AS StockCount  -- CategoryId removed: it cannot be selected next to COUNT(*) without GROUP BY
         FROM Stock_Category SC
         INNER JOIN Stock S ON S.Id = SC.StockId
         WHERE S.Published = 1 AND 
             ((S.WidthMM IS NOT NULL AND 
               S.HeightMM IS NOT NULL AND 
               S.DepthMM IS NOT NULL AND
                S.WeightG IS NOT NULL)) AND
            CategoryId = CS.Id
        ) SC

WHERE (@IncludeSelf = 1 OR CS.Id != @CategoryId) 
HAVING SUM(CONVERT(INT, Visible)) = COUNT(*)

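P.S. The first query looks strange: you have the @CategoryId parameter but you don't search by it. You build every possible tree and then filter. I think there is a mistake in your algorithm; could you write GetParentCategoriesByCategoryId(@CategoryId) instead?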
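
As a rough illustration of what "inline" means here, the sketch below shows an inline table-valued function that returns the ancestor chain of a category. The function name dbo.GetParentCategoryIdsInline is hypothetical, the column list is trimmed to what the outer query needs, and the all-or-nothing visibility rule is approximated with NOT EXISTS (assuming it should cover every row in the chain, including the starting category). It is not the original function.

```sql
-- Hypothetical inline TVF: a single SELECT, so the optimizer can expand it into the calling query.
CREATE FUNCTION dbo.GetParentCategoryIdsInline (@CategoryId UNIQUEIDENTIFIER)
RETURNS TABLE
AS
RETURN
    WITH CategoryStructure (Id, ParentId, Depth, Visible) AS
    (
        -- Anchor: the starting category.
        SELECT C.Id, C.ParentId, 0, C.Visible
        FROM Category C
        WHERE C.Id = @CategoryId

        UNION ALL

        -- Recursive step: move one level up to the parent.
        SELECT P.Id, P.ParentId, CS.Depth + 1, P.Visible
        FROM Category P
        INNER JOIN CategoryStructure CS ON P.Id = CS.ParentId
    )
    SELECT CS.Id, CS.ParentId, CS.Depth
    FROM CategoryStructure CS
    -- Return nothing if any category in the chain is hidden (or has no Visible value).
    WHERE NOT EXISTS (SELECT 1
                      FROM CategoryStructure X
                      WHERE X.Visible = 0 OR X.Visible IS NULL);
```

It could then be used as EXISTS (SELECT 1 FROM dbo.GetParentCategoryIdsInline(SC.CategoryId) P WHERE P.Id = @CategoryId) in the outer query.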

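【Comments】:

    【Answer 2】:

    So, for anyone curious, the final solution involved redoing my indexes, using some of the suggestions from the comments above, and, importantly, removing the temporary table.

    In the end I managed to get the query time down to under 1 second, which was the goal. However, I'm not so sure about the GROUP BY and wonder whether there is a better way to do this? Does anyone have any further improvements?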

        WITH categorystructure (id, parentid, parentname, NAME, depth, visible) 
         AS (SELECT C.id, 
                    C.parentid, 
                    Cast('' AS NVARCHAR(500)) AS ParentName, 
                    C.NAME, 
                    0                         AS Depth, 
                    C.visible 
             FROM   category C 
             WHERE  id = @CategoryId 
             UNION ALL 
             SELECT ParentCategory.id, 
                    ParentCategory.parentid, 
                    categorystructure.NAME      AS ParentName, 
                    ParentCategory.NAME, 
                    categorystructure.depth + 1 AS Depth, 
                    ParentCategory.visible 
             FROM   category ParentCategory 
                    INNER JOIN categorystructure 
                            ON ParentCategory.id = categorystructure.parentid) 
    SELECT C.*, 
           Isnull(SC.stockcount, 0) AS StockCount, 
           CS.depth 
    FROM   categorystructure CS 
           INNER JOIN category C 
                   ON C.id = CS.id 
           LEFT JOIN (SELECT categoryid, 
                             Count(*) AS StockCount 
                      FROM   stock_category SC 
                             INNER JOIN stock S 
                                     ON S.id = SC.stockid 
                      WHERE  S.published = 1 
                             AND ( @AustPostShippingEnabled = 0 
                                    OR ( S.widthmm IS NOT NULL 
                                         AND S.heightmm IS NOT NULL 
                                         AND S.depthmm IS NOT NULL 
                                         AND S.weightg IS NOT NULL ) ) 
                      GROUP  BY categoryid) SC 
                  ON SC.categoryid = CS.id 
    WHERE  ( @IncludeSelf = 1 
              OR CS.id != @CategoryId ) 
    GROUP  BY C.id, 
              C.posdissectionid, 
              C.posfamilyclassid, 
              C.parentid, 
              C.code, 
              C.NAME, 
              C.description, 
              C.azureid, 
              C.extension, 
              C.visible, 
              C.orderindex, 
              SC.stockcount, 
              CS.depth 
    HAVING Sum(CONVERT(INT, CS.visible)) = Count(*) 
    

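    【Comments】:

    • HAVING Sum(CONVERT(INT, CS.visible)) = Count(*) is equivalent to WHERE CS.visible = cast(1 as bit), just saying :) That way you can get rid of the GROUP BY.
    • @Anand Thanks for the suggestion, but I don't believe you are correct. How does WHERE CS.visible = cast(1 as bit) accomplish the same thing as Sum(CONVERT(INT, CS.visible)) = Count(*)? Are you sure you understand the intent?
    • If you convert the boolean to an integer, sum the column, and compare the result with the number of records, the only way they can match is if all rows are true. It's not hard to see. If you took the suggestion above, could you upvote the helpful answer? Thanks :)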
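
The disagreement in these comments comes down to what the aggregate is computed over. The following small, self-contained sketch (the @Rows table variable and its data are invented for illustration) contrasts a row-level filter, an all-or-nothing check over the whole set, and the HAVING form when every group contains a single row, as happens when grouping by a unique id.

```sql
DECLARE @Rows TABLE (Id INT, Visible BIT);
INSERT INTO @Rows (Id, Visible) VALUES (1, 1), (2, 0), (3, 1);

-- Row-level filter: returns Id 1 and 3.
SELECT Id
FROM @Rows
WHERE Visible = CAST(1 AS BIT);

-- All-or-nothing over the whole set: returns no rows, because Id 2 is not visible.
SELECT Id
FROM @Rows
WHERE NOT EXISTS (SELECT 1 FROM @Rows WHERE Visible = 0);

-- Grouped by a unique Id, each group holds one row, so HAVING behaves
-- like the row-level filter: returns Id 1 and 3.
SELECT Id
FROM @Rows
GROUP BY Id
HAVING SUM(CONVERT(INT, Visible)) = COUNT(*);
```

So with a GROUP BY on the category id, the HAVING clause filters hidden categories one by one, which is not the same as discarding the whole result when any category is hidden.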
    【Answer 3】:

    Two things:

    1. Change the IN clause to an inner join:

      SELECT CategoryId, COUNT(*) AS StockCount 
      FROM Stock_Category SC
      INNER JOIN Stock S ON S.Id = SC.StockId
      WHERE S.Published = 1 AND 
      ((S.WidthMM IS NOT NULL AND 
      S.HeightMM IS NOT NULL AND 
      S.DepthMM IS NOT NULL AND
      S.WeightG IS NOT NULL)) AND
      CategoryId IN(SELECT CategoryId FROM CategoryStructure)
      GROUP BY CategoryId
      

    to:

    SELECT SC.CategoryId, COUNT(*) AS StockCount 
    FROM Stock_Category SC
    INNER JOIN Stock S ON S.Id = SC.StockId
    INNER JOIN CategoryStructure AS CS
        ON CS.Id = SC.CategoryId  -- the CTE exposes the category key as Id
    WHERE S.Published = 1 AND 
    ((S.WidthMM IS NOT NULL AND 
    S.HeightMM IS NOT NULL AND 
    S.DepthMM IS NOT NULL AND
    S.WeightG IS NOT NULL))
    GROUP BY SC.CategoryId
    
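    2. Your query spends most of its time on an index seek on IX_StockAllColumns. If that really is a nonclustered index on all columns, create a new nonclustered index on the Published, WidthMM, HeightMM, DepthMM and WeightG columns.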
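
A narrower index along those lines might look like the sketch below. The index name is hypothetical, and keying on Published while carrying the dimension columns as included columns is one possible layout rather than something taken from the question; the actual execution plan should guide the final choice.

```sql
-- Hypothetical name; Published is the seek key, the dimension columns ride along
-- so the IS NOT NULL checks can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_Stock_Published_Dimensions
    ON Stock (Published)
    INCLUDE (WidthMM, HeightMM, DepthMM, WeightG);
```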

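    【Comments】:

    • I don't think you quite understand the original query, the question, the schema, or the table-valued function.
    • Replacing the IN clause with an INNER JOIN will not affect the query plan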