【Question title】: Speeding up cumulative sum calculation in SQL Server
【Posted】: 2019-09-17 18:18:29
【Question】:

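As part of a solution I am building, I have to implement a view that calculates running totals (cumulative sums). I took the simplest, most basic approach, joining the table to a list of dates, but the view is still quite slow. Adding indexes to the tables did not help much, even though the table itself has only about 15K rows. Can anyone suggest the right way to speed this up?

There are a few considerations:

  1. I need the cumulative sum for each specific ProjectID and ContractorID. For the same date I may have many ProjectID and ContractorID combinations, but the combination of Date, ProjectID and ContractorID is always unique.

  2. There is a master table containing dates and ProjectIDs (but no ContractorIDs), and I need the cumulative sum for every date and ProjectID in this master date table.

  3. I need to calculate cumulative sums for several columns at once, not just one.

To give you a bit of context, my tables are:

  • dbo.Project_Reporting_Schedule holds the master list of ProjectID and Date combinations. For each combination I need to calculate cumulative sums from another table. Note that it has no ContractorID!

  • Project_value_delivery is the table holding the actual value columns to sum cumulatively. It has its own set of dates, which may or may not match the dates in Project_Reporting_Schedule, so we cannot simply join the table to itself. Note also that it does have ContractorID!

At the moment I have the following select, which is fairly self-explanatory: it joins the value table to the master date list and sums. The select works correctly, but even with only 15K records it takes almost 5 seconds to run, which is rather slow.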

select 
    pv2.ProjectID,
    pv2.ContractorID,
    pv1.Date, 
    sum(pv2.ValuePlanned) as PlannedCumulative, 
    sum(pv2.ValueActual) as ActualCumulative,
    sum(pv2.MobilizationPlanned) as MobilizationPlanned,
    sum(pv2.MobilizationActual) as MobilizationActual,
    sum(pv2.EngineeringPlanned) as EngineeringPlanned,
    sum(pv2.EngineeringActual) as EngineeringActual,
    sum(pv2.ProcurementPlanned) as ProcurementPlanned,
    sum(pv2.ProcurementActual) as ProcurementActual,
    sum(pv2.ConstructionPlanned) as ConstructionPlanned,
    sum(pv2.ConstructionActual) as ConstructionActual,
    sum(pv2.CommisioningTestingPlanned) as CommisioningTestingPlanned,
    sum(pv2.CommisioningTestingActual) as CommisioningTestingActual
from 
    dbo.Project_Reporting_Schedule as pv1
join 
    dbo.Project_value_delivery as pv2 on pv1.Date >= pv2.Date and pv1.ProjectID = pv2.ProjectID
group by 
    pv2.ProjectID, pv2.ContractorID, pv1.Date

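Update

To clarify further, I have put the execution plan here: https://www.brentozar.com/pastetheplan/?id=H12t-O1PS

The indexes are the same on both tables: each has an index on the (ProjectID, Date) combination, plus separate single-column indexes on ProjectID and on Date.

All of them are nonclustered indexes, and unique where applicable.

We can see that the plan performs a "nonclustered index seek", which accounts for most of the execution cost. Perhaps the indexes need tuning?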

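【Comments】:

  • If you have not done so already, you may need to define indexes on the pv1.Date, pv2.Date, ContractorID and ProjectID fields.
  • Please post the query plan (brentozar.com/pastetheplan). Please also include details of the indexes you created.
  • @Alex Done.
  • Looking at the plan, I can see a table scan of Project_value_delivery. I am not sure adding an index would help here, because you are summing so many columns. Try window functions (stackoverflow.com/a/13331102/6305294) and see whether they help.
  • @Alex Thanks, window functions are indeed one way to solve this, and they improved the speed significantly. It was not obvious how to apply them correctly in my case, but I figured it out.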
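
On the index question, one thing that could still be tried is a single covering index on the value table, so that the summed columns can be read from the index itself. The following is only a sketch: the index name is hypothetical, the key order (ProjectID, ContractorID, Date) is an assumption chosen to match a per-contractor running total ordered by date, and, as the comment above points out, an index may not help much when this many columns are summed.

```sql
-- Hypothetical covering index; the name and the key order are assumptions.
-- The key columns match the grouping and ordering of the running total,
-- and INCLUDE carries every summed column so no extra lookups are needed.
CREATE NONCLUSTERED INDEX IX_Project_value_delivery_Running
    ON dbo.Project_value_delivery (ProjectID, ContractorID, Date)
    INCLUDE (ValuePlanned, ValueActual,
             MobilizationPlanned, MobilizationActual,
             EngineeringPlanned, EngineeringActual,
             ProcurementPlanned, ProcurementActual,
             ConstructionPlanned, ConstructionActual,
             CommisioningTestingPlanned, CommisioningTestingActual);
```

Whether the optimizer actually picks this index up would have to be confirmed against the execution plan.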

Tags: sql sql-server cumulative-sum


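【Solution 1】:

OK, so the window functions that @Alex suggested in the comments are one way to go. Compared with the original code, the following runs lightning fast: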

select 
       pv2.ProjectID,
       pv2.ContractorID,
       pv1.Date, 
       sum(pv2.ValuePlanned) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as PlannedCumulative, 
       sum(pv2.ValueActual) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as ActualCumulative,
       sum(pv2.MobilizationPlanned) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as MobilizationPlanned,
       sum(pv2.MobilizationActual) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as MobilizationActual,
       sum(pv2.EngineeringPlanned) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as EngineeringPlanned,
       sum(pv2.EngineeringActual) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as EngineeringActual,
       sum(pv2.ProcurementPlanned) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as ProcurementPlanned,
       sum(pv2.ProcurementActual) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as ProcurementActual,
       sum(pv2.ConstructionPlanned) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as ConstructionPlanned,
       sum(pv2.ConstructionActual) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as ConstructionActual,
       sum(pv2.CommisioningTestingPlanned) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as CommisioningTestingPlanned,
       sum(pv2.CommisioningTestingActual) over (partition by pv2.ProjectID, pv2.ContractorID order by pv1.Date ROWS between unbounded preceding and current row) as CommisioningTestingActual
from 
       dbo.Project_Reporting_Schedule as pv1
       join dbo.Project_value_delivery as pv2 on pv1.Date = pv2.Date and pv1.ProjectID = pv2.ProjectID
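
One caveat: the query above joins on pv1.Date = pv2.Date, whereas the original joined on pv1.Date >= pv2.Date. Since the question says the delivery dates may not match the schedule dates, the two queries are not guaranteed to return the same rows. A minimal sketch of one way to keep the original "on or before" semantics while still using window functions is shown below. It is an assumption-laden illustration, not the author's method: #Schedule and #Delivery are hypothetical miniature stand-ins for the two tables, only two value columns are shown, and it relies on the stated uniqueness of (Date, ProjectID, ContractorID).

```sql
-- Hypothetical miniature copies of the two tables (temp tables, two value columns only).
CREATE TABLE #Schedule (ProjectID int NOT NULL, [Date] date NOT NULL,
                        PRIMARY KEY (ProjectID, [Date]));
CREATE TABLE #Delivery (ProjectID int NOT NULL, ContractorID int NOT NULL, [Date] date NOT NULL,
                        ValuePlanned decimal(18,2) NULL, ValueActual decimal(18,2) NULL,
                        PRIMARY KEY (ProjectID, ContractorID, [Date]));

INSERT INTO #Schedule VALUES (1, '2019-01-31'), (1, '2019-02-28'), (1, '2019-03-31');
INSERT INTO #Delivery VALUES
    (1, 10, '2019-01-15', 100, 90),   -- falls between schedule dates
    (1, 10, '2019-02-28',  50, 40),   -- falls exactly on a schedule date
    (1, 20, '2019-03-10',  70, 70);

-- Step 1: running totals per (ProjectID, ContractorID), computed once on the delivery rows.
WITH running AS (
    SELECT d.ProjectID, d.ContractorID, d.[Date],
           SUM(d.ValuePlanned) OVER (PARTITION BY d.ProjectID, d.ContractorID
                                     ORDER BY d.[Date]
                                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS PlannedCumulative,
           SUM(d.ValueActual)  OVER (PARTITION BY d.ProjectID, d.ContractorID
                                     ORDER BY d.[Date]
                                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ActualCumulative
    FROM #Delivery AS d
)
-- Step 2: for every schedule date and every contractor of that project,
-- pick the latest running total dated on or before the schedule date.
SELECT c.ProjectID, c.ContractorID, s.[Date], r.PlannedCumulative, r.ActualCumulative
FROM #Schedule AS s
JOIN (SELECT DISTINCT ProjectID, ContractorID FROM #Delivery) AS c
    ON c.ProjectID = s.ProjectID
CROSS APPLY (
    SELECT TOP (1) r0.PlannedCumulative, r0.ActualCumulative
    FROM running AS r0
    WHERE r0.ProjectID = c.ProjectID
      AND r0.ContractorID = c.ContractorID
      AND r0.[Date] <= s.[Date]
    ORDER BY r0.[Date] DESC
) AS r
ORDER BY c.ProjectID, c.ContractorID, s.[Date];

DROP TABLE #Schedule;
DROP TABLE #Delivery;
```

With this sample data, contractor 10 reaches a planned total of 150 on 2019-02-28 because the 2019-01-15 delivery is carried forward; an equality join on the date would leave that delivery out. Whether this shape is faster than the original grouped join on the real 15K rows would have to be measured.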

【Comments】:

    【Solution 2】:

    Take the date comparison out of the JOIN clause and move it to the WHERE clause:

    select 
           pv2.ProjectID,
           pv2.ContractorID,
           pv1.Date, 
           sum(pv2.ValuePlanned) as PlannedCumulative, 
           sum(pv2.ValueActual) as ActualCumulative,
           sum(pv2.MobilizationPlanned) as MobilizationPlanned,
           sum(pv2.MobilizationActual) as MobilizationActual,
           sum(pv2.EngineeringPlanned) as EngineeringPlanned,
           sum(pv2.EngineeringActual) as EngineeringActual,
           sum(pv2.ProcurementPlanned) as ProcurementPlanned,
           sum(pv2.ProcurementActual) as ProcurementActual,
           sum(pv2.ConstructionPlanned) as ConstructionPlanned,
           sum(pv2.ConstructionActual) as ConstructionActual,
           sum(pv2.CommisioningTestingPlanned) as CommisioningTestingPlanned,
           sum(pv2.CommisioningTestingActual) as CommisioningTestingActual
           FROM
           dbo.Project_Reporting_Schedule as pv1
           join dbo.Project_value_delivery as pv2 on pv1.ProjectID = pv2.ProjectID
           WHERE pv1.Date >= pv2.Date
           GROUP BY pv2.ProjectID, pv2.ContractorID, pv1.Date
    

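    【Comments】:

    • Do you think this will change the execution plan?
    • I would be surprised if this made any difference, but please post your results.
    • Sorry, it did not really help: the execution time is the same and, as far as I can tell, the execution plan looks the same too.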
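
That result is what one would expect: for an inner join, a predicate in the ON clause and the same predicate in the WHERE clause are logically equivalent. A minimal way to check this kind of claim, assuming the two tables from the question and showing only one summed column for brevity, is to run both variants with SET STATISTICS TIME and SET STATISTICS IO switched on and compare the Messages output:

```sql
-- Report CPU/elapsed time and logical reads for each statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant A: date comparison in the ON clause (as in the question).
SELECT pv2.ProjectID, pv2.ContractorID, pv1.Date,
       SUM(pv2.ValuePlanned) AS PlannedCumulative
FROM dbo.Project_Reporting_Schedule AS pv1
JOIN dbo.Project_value_delivery AS pv2
    ON pv1.ProjectID = pv2.ProjectID
   AND pv1.Date >= pv2.Date
GROUP BY pv2.ProjectID, pv2.ContractorID, pv1.Date;

-- Variant B: date comparison moved to the WHERE clause (as in Solution 2).
SELECT pv2.ProjectID, pv2.ContractorID, pv1.Date,
       SUM(pv2.ValuePlanned) AS PlannedCumulative
FROM dbo.Project_Reporting_Schedule AS pv1
JOIN dbo.Project_value_delivery AS pv2
    ON pv1.ProjectID = pv2.ProjectID
WHERE pv1.Date >= pv2.Date
GROUP BY pv2.ProjectID, pv2.ContractorID, pv1.Date;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

If the logical reads and the actual execution plans match for both statements, the rewrite has not changed how the query is executed.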