【Question title】: Increase performance when searching on concatenated columns
【Posted】: 2015-02-03 12:02:35
【Question description】:

I have a SQL Server 2008 instance with a large table, and I need to run a COUNT DISTINCT query over several columns combined. Some of the columns are varchar and some are int.

The query so far looks like this:

SELECT 
    CAST(datepart(yyyy, [HistDate]) as varchar(4)) + '-' + CAST(datepart(mm, [HistDate]) as varchar(2)) + '-1' AS [DateSelector], 
    [Document] AS [Document], 
    -- This is the bit that needs optimizing
    COUNT(DISTINCT
        Document + 
        Reference + 
        CONVERT(varchar(20), BatchID) +               -- this is an int
        ISNULL(CONVERT(varchar(20), ResetCount), '')  -- this is an int
    )
FROM documents
GROUP BY
    CAST(datepart(yyyy, [HistDate]) as varchar(4)) + '-' + CAST(datepart(mm, [HistDate]) as varchar(2)) + '-1', 
    [Document]
ORDER BY ...

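Currently this query takes 23 seconds, whereas replacing the COUNT above with COUNT(*) takes only a few seconds. I tried adding a composite index, which yielded no improvement. What kind of optimization can I apply to speed up the search?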

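【Comments】:

  • I would suggest creating a computed column for DateSelector and indexing it. That would also add overhead to DML operations, though. What is the modification/retrieval ratio of the table?
  • @VishalGajjar While that might help performance, according to the question and its title the performance problem lies elsewhere in the code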

Tags: sql-server


【Solution 1】:

You can probably improve the time by using

group by datepart(yyyy, Zeitstempel), datepart(mm, Zeitstempel)

You can group on the integers alone, without any conversion, and still use them in the SELECT list.
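
A rough sketch of how this suggestion might look when applied to the question's query. It assumes `Zeitstempel` stands in for the question's `HistDate` column; `COUNT(*)` and the alias `Cnt` are placeholders for illustration, not part of the original answer.

```sql
-- Sketch only: group on the integer date parts and build the string
-- in the SELECT list from those same grouped expressions.
SELECT
    CAST(DATEPART(yyyy, [HistDate]) AS varchar(4)) + '-'
        + CAST(DATEPART(mm, [HistDate]) AS varchar(2)) + '-1' AS [DateSelector],
    [Document],
    COUNT(*) AS [Cnt]          -- placeholder aggregate (assumption)
FROM documents
GROUP BY
    DATEPART(yyyy, [HistDate]),   -- integer year, no string conversion
    DATEPART(mm, [HistDate]),     -- integer month, no string conversion
    [Document];
```

This only removes the string building from the grouping step. The DISTINCT over the concatenated columns, which the question identifies as the slow part, is not addressed by it.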


    【Solution 2】:

    Concatenating columns will not improve performance.

    Try this:

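    ;WITH CTE AS
    (
      SELECT 
        [HistDate],
        [Document] AS [Document], 
        -- partition on the separate columns instead of a concatenation:
        -- Reference + BatchID would force an implicit varchar-to-int conversion.
        -- The month is part of the partition so each month counts its own
        -- distinct combinations, as in the original query.
        row_number() over (
          partition by dateadd(month, datediff(month, 0, [HistDate]), 0),
                       Document, Reference, BatchID, ResetCount
          order by (select 1)) rn
      FROM documents
    )
    SELECT
      convert(char(8), dateadd(month, 
        datediff(month, 0, [HistDate]), 0), 126) + '1' AS [DateSelector], 
      [Document] AS [Document],
      count(*) as cnt
    FROM CTE
    WHERE rn = 1
    GROUP BY
      -- note: you can't name a column in GROUP BY
      dateadd(month, datediff(month, 0, [HistDate]), 0),
      [Document]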
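
The same idea (deduplicate on the separate columns first, then count rows) could also be sketched with a derived table and `SELECT DISTINCT`. This is an illustrative variant, not the answer author's code; the names `MonthStart`, `d` and `Cnt` are hypothetical.

```sql
-- Sketch only: one row per distinct (month, Document, Reference, BatchID,
-- ResetCount) combination, then a plain COUNT(*) per month and document.
SELECT
    d.[MonthStart],
    d.[Document],
    COUNT(*) AS [Cnt]
FROM
(
    SELECT DISTINCT
        DATEADD(month, DATEDIFF(month, 0, [HistDate]), 0) AS [MonthStart],
        [Document],
        [Reference],
        [BatchID],
        [ResetCount]
    FROM documents
) AS d
GROUP BY
    d.[MonthStart],
    d.[Document];
```

The counts can differ from the concatenation-based query in edge cases. With default settings a NULL `Document` or `Reference` makes the whole concatenation NULL, which `COUNT(DISTINCT ...)` ignores, and different column values can concatenate to the same string (for example `'ab' + 'c'` and `'a' + 'bc'`). Comparing on the separate columns avoids both effects.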
    
