【Question title】: Tuning a GROUP BY that's running slowly
【Posted】: 2016-07-19 08:02:33
【Question description】:

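The following SQL takes 5 seconds on a database where the largest table, PremiseProviderBillings, has 350,000 records. On a copy of the same database with 1.5 million records, it takes over a minute.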

SELECT
   n.CustomerInvoiceNumberId as InvoiceNo,C.CustomerBillId,c.customerid, S.Volumetric, S.Fixed, S.VAT, S.Discount, C.Debit,c.EffectiveDate,c.TransactionDateTime,s.Consumption,r.CustomerCreditNoteId--,s.Volumetric + s.Fixed + s.Vat - s.discount - c.debit as variance
FROM
  CustomerPayments C 
INNER JOIN
  (SELECT
     CustomerBillId, SUM(a.VolumetricCharge) as Volumetric,SUM(a.FixedCharge) as Fixed,
     SUM(a.VAT) as VAT,SUM(a.Discount) as Discount,sum(a.EstimatedConsumption) as Consumption
   FROM
     PremiseProviderBillings a, PremiseProviderBills b
    WHERE a.PremiseProviderBillId = b.PremiseProviderBillId
   GROUP BY
     CustomerBillId) S
ON
  C.CustomerBillId = S.CustomerBillId 
  and debit <> 0 -- hide credit note lines, we mark these results with customerCreditNoteId to show they have been credited
INNER JOIN dbo.CustomerInvoiceNumbers n on c.CustomerBillId = n.CustomerBillId
left OUTER JOIN
           dbo.CustomerCreditNotes AS r ON c.CustomerPaymentId = r.CustomerPaymentId
where isnull(c.transactionDateTimeEnd,'')=''

If I then run only the inner SQL, which sums the values, it takes 2 seconds on the smaller database and 34 seconds on the larger one. The inner SQL is below...

SELECT
     CustomerBillId, SUM(a.VolumetricCharge) as Volumetric,SUM(a.FixedCharge) as Fixed,
     SUM(a.VAT) as VAT,SUM(a.Discount) as Discount,sum(a.EstimatedConsumption) as Consumption
   FROM
     PremiseProviderBillings a, PremiseProviderBills b
    WHERE a.PremiseProviderBillId = b.PremiseProviderBillId
   GROUP BY
     CustomerBillId

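So clearly this SQL does not scale at all. Given that the database will grow, what technique should be used to improve it?

I have checked all the joins to make sure there are no missing indexes, or rather, to make sure every join is key-based, and they look fine.

I had assumed this approach was OK, but should I restructure the SQL? Is it unscalable and inefficient as written?

Regards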

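【Comments】:

  • What indexes exist on PremiseProviderBillings and PremiseProviderBills? The query plan and the STATISTICS IO output would also help with this.
  • When you group by a value, SQL Server tries to perform a distinct operation to eliminate duplicates, so the GROUP BY column should be indexed, or at the very least you should tell SQL Server that the column is unique. The next step would be the summed columns: are they included in the index as covering columns?
  • "The largest table, PremiseProviderBillings, has 350,000 records. On the same database with 1.5 million records, it takes over a minute." You are not reading from that table alone here, so it would be more accurate to judge scalability by the number of rows the join produces: "PremiseProviderBillings a, PremiseProviderBills b WHERE a.PremiseProviderBillId = b.PremiseProviderBillId". Does the join still output 350K and 1.5M rows respectively?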
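
The information asked for in these comments could be gathered roughly as follows. This is only a sketch: the table and column names come from the question, the `dbo` schema is an assumption, and the index name `IX_PremiseProviderBillings_BillId_Covering` is hypothetical. Whether the covering index at the end actually helps can only be judged from the real execution plan.

```sql
-- Report logical reads and CPU/elapsed time for statements run in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- List the existing indexes (key and included columns) on the two joined tables.
SELECT  OBJECT_NAME(i.object_id) AS TableName,
        i.name                   AS IndexName,
        i.type_desc              AS IndexType,
        c.name                   AS ColumnName,
        ic.key_ordinal           AS KeyOrdinal,
        ic.is_included_column    AS IsIncluded
FROM    sys.indexes AS i
        INNER JOIN sys.index_columns AS ic
            ON ic.object_id = i.object_id
            AND ic.index_id = i.index_id
        INNER JOIN sys.columns AS c
            ON c.object_id = ic.object_id
            AND c.column_id = ic.column_id
WHERE   i.object_id IN (OBJECT_ID('dbo.PremiseProviderBillings'),
                        OBJECT_ID('dbo.PremiseProviderBills'))
ORDER BY TableName, IndexName, ic.is_included_column, ic.key_ordinal;

-- Count the rows the join produces, so the two databases can be compared
-- on join output rather than on the size of a single table.
SELECT  COUNT_BIG(*) AS JoinedRows
FROM    dbo.PremiseProviderBillings AS a
        INNER JOIN dbo.PremiseProviderBills AS b
            ON a.PremiseProviderBillId = b.PremiseProviderBillId;

-- Hypothetical covering index: keyed on the join column and carrying every
-- summed column, so the aggregate can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_PremiseProviderBillings_BillId_Covering
    ON dbo.PremiseProviderBillings (PremiseProviderBillId)
    INCLUDE (VolumetricCharge, FixedCharge, VAT, Discount, EstimatedConsumption);
```

Running the inner query with the two `SET STATISTICS` options on, before and after creating the index, gives a like-for-like comparison of logical reads.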

Tags: sql sql-server


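【Solution 1】:

If you run this query often, and depending on how frequently you write to the tables, it may be worth creating an indexed view for it. Bear in mind that this is speculation, and that indexed views come with a trade-off: your reads get faster, but your writes get slower.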

CREATE VIEW dbo.CustomerBillingView
WITH SCHEMABINDING
AS
    SELECT  b.CustomerBillId,
            SUM(a.VolumetricCharge) AS Volumetric,
            SUM(a.FixedCharge) AS Fixed,
            SUM(a.VAT) AS VAT,
            SUM(a.Discount) AS Discount,
            SUM(a.EstimatedConsumption) AS Consumption,
            COUNT_BIG(*) AS Records -- REQUIRED TO CREATE INDEX
    FROM    dbo.PremiseProviderBillings a
            INNER JOIN dbo.PremiseProviderBills b
                ON a.PremiseProviderBillId = b.PremiseProviderBillId
    GROUP BY b.CustomerBillId;
GO

CREATE UNIQUE CLUSTERED INDEX UQ_CustomerBillingView__CustomerBillId
    ON dbo.CustomerBillingView (CustomerBillId);

GO

Then you just need to query the view with the NOEXPAND hint to make sure the index is used.

SELECT  n.CustomerInvoiceNumberId as InvoiceNo,
        c.CustomerBillId,
        c.customerid, 
        s.Volumetric, 
        s.Fixed, 
        s.VAT, 
        s.Discount, 
        c.Debit,
        c.EffectiveDate,
        c.TransactionDateTime,
        s.Consumption,
        r.CustomerCreditNoteId
        --,s.Volumetric + s.Fixed + s.Vat - s.discount - c.debit as variance
FROM    CustomerPayments AS c 
        INNER JOIN dbo.CustomerBillingView AS s WITH (NOEXPAND)
            ON c.CustomerBillId = s.CustomerBillId 
            AND c.Debit <> 0 
            -- hide credit note lines, we mark these results with customerCreditNoteId to show they have been credited
        INNER JOIN dbo.CustomerInvoiceNumbers n 
            ON c.CustomerBillId = n.CustomerBillId
        LEFT OUTER JOIN dbo.CustomerCreditNotes AS r 
            ON c.CustomerPaymentId = r.CustomerPaymentId
WHERE   ISNULL(c.transactionDateTimeEnd,'') = '';

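As with every query tuning question, only you have all the information needed to answer it properly. In my experience (mostly with billing systems), indexed views like this usually work well for billing data: most invoice runs are periodic, so writes arrive in batches rather than continuously, and reads tend to outnumber writes because the data is static. Once an invoice has been created, it is rarely updated.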
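
If the batch writes themselves become slow because the view has to be maintained row by row, one option to consider is disabling the view's index for the duration of an invoice run and rebuilding it afterwards. The following is an untested sketch that reuses the index and view names defined above; whether it pays off depends on the batch size, and queries that use `NOEXPAND` against the view should be expected to fail while the index is disabled.

```sql
-- Before a large invoice run: disable the view's clustered index so the batch
-- writes no longer have to maintain the aggregated rows. Disabling a clustered
-- index on a view discards the materialized data but keeps the definition.
ALTER INDEX UQ_CustomerBillingView__CustomerBillId
    ON dbo.CustomerBillingView DISABLE;

-- ... run the batch writes to dbo.PremiseProviderBillings and
--     dbo.PremiseProviderBills here ...

-- After the run: rebuild the index, which re-materializes the aggregates
-- in a single pass and re-enables the index.
ALTER INDEX UQ_CustomerBillingView__CustomerBillId
    ON dbo.CustomerBillingView REBUILD;
```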

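【Comments】:

  • When I try to create it, it says: Cannot schema bind view 'dbo.CustomerBillingView' because name 'PremiseProviderBillings' is invalid for schema binding. Names must be in two-part format and an object cannot reference itself. Any ideas? Your solution looks promising, thanks.
  • Sorry, I forgot to add the schema prefix to the tables used in the view. To create the index the view must be schema-bound, and for schema binding the schema prefix must be used everywhere (although it is a good idea to use it everywhere anyway).
  • Truly remarkable, thank you very much GarethD. 1 second per execution.
【Solution 2】: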

Try using a common table expression for your inner query; it might speed things up.

WITH CTE AS
(
    SELECT
         CustomerBillId, SUM(a.VolumetricCharge) as Volumetric,SUM(a.FixedCharge) as Fixed,
         SUM(a.VAT) as VAT,SUM(a.Discount) as Discount,sum(a.EstimatedConsumption) as Consumption
       FROM
         PremiseProviderBillings a, PremiseProviderBills b
        WHERE a.PremiseProviderBillId = b.PremiseProviderBillId
       GROUP BY
         CustomerBillId
)   

SELECT
   n.CustomerInvoiceNumberId as InvoiceNo,C.CustomerBillId,c.customerid, S.Volumetric, S.Fixed, S.VAT, S.Discount, C.Debit,c.EffectiveDate,c.TransactionDateTime,s.Consumption,r.CustomerCreditNoteId--,s.Volumetric + s.Fixed + s.Vat - s.discount - c.debit as variance
FROM
  CustomerPayments C 
INNER JOIN
  CTE S
ON
  C.CustomerBillId = S.CustomerBillId 
  and debit <> 0 -- hide credit note lines, we mark these results with customerCreditNoteId to show they have been credited
INNER JOIN dbo.CustomerInvoiceNumbers n on c.CustomerBillId = n.CustomerBillId
left OUTER JOIN
           dbo.CustomerCreditNotes AS r ON c.CustomerPaymentId = r.CustomerPaymentId
where isnull(c.transactionDateTimeEnd,'')=''

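【Comments】:

  • This won't make any difference at all. The execution plan is the same whether the derived table is written as a subquery or as a common table expression.
  • Would you use a temp table instead, @GarethD?
  • I wouldn't. Materializing the subquery sometimes helps, but given that the subquery on its own takes 34 seconds, it is unlikely to improve the speed much. I think the answer lies in indexing, but with optimization questions this is mostly guesswork, because there are so many things to consider, most of which only the OP knows.
  • Wow, this makes a big difference. Returning results on the LARGE database takes just 1 second, thank you!!!!!! :-)
  • I spoke too soon. On the large data set it sometimes runs quickly and sometimes takes 1 minute, and then keeps running. Very strange.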
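
For reference, the temp-table alternative raised in these comments would look roughly like the sketch below. The temp table name `#BillTotals` and its index name are hypothetical, and the `dbo` schema is assumed. As the comment above points out, the aggregation itself still has to run once, so this mainly helps when the totals are reused several times in the same session.

```sql
-- Materialize the per-bill totals once into a temp table.
SELECT  b.CustomerBillId,
        SUM(a.VolumetricCharge)     AS Volumetric,
        SUM(a.FixedCharge)          AS Fixed,
        SUM(a.VAT)                  AS VAT,
        SUM(a.Discount)             AS Discount,
        SUM(a.EstimatedConsumption) AS Consumption
INTO    #BillTotals
FROM    dbo.PremiseProviderBillings AS a
        INNER JOIN dbo.PremiseProviderBills AS b
            ON a.PremiseProviderBillId = b.PremiseProviderBillId
GROUP BY b.CustomerBillId;

-- Index the temp table on the join column (one row per CustomerBillId).
CREATE UNIQUE CLUSTERED INDEX IX_BillTotals_CustomerBillId
    ON #BillTotals (CustomerBillId);

-- Join the payments to the materialized totals instead of the derived table.
SELECT  n.CustomerInvoiceNumberId AS InvoiceNo,
        c.CustomerBillId,
        c.CustomerId,
        s.Volumetric,
        s.Fixed,
        s.VAT,
        s.Discount,
        c.Debit,
        c.EffectiveDate,
        c.TransactionDateTime,
        s.Consumption,
        r.CustomerCreditNoteId
FROM    dbo.CustomerPayments AS c
        INNER JOIN #BillTotals AS s
            ON c.CustomerBillId = s.CustomerBillId
            AND c.Debit <> 0 -- hide credit note lines
        INNER JOIN dbo.CustomerInvoiceNumbers AS n
            ON c.CustomerBillId = n.CustomerBillId
        LEFT OUTER JOIN dbo.CustomerCreditNotes AS r
            ON c.CustomerPaymentId = r.CustomerPaymentId
WHERE   ISNULL(c.TransactionDateTimeEnd, '') = '';

DROP TABLE #BillTotals;
```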