【Question title】: TSQL: Check columns for at least one non-null value
【Posted】: 2015-07-17 04:17:28
【Question】:

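I want to write a TSQL query that checks a set of columns in a table independently, to see which of them contain at least one non-null value. The check for each column should return T/F (1/0) accordingly.

The first thing that came to mind was the COUNT aggregate function. Since COUNT(expression) excludes NULLs from the total, non-null data exists if COUNT > 0.

This seems a bit clunky, since it has to count all of the data. I really only need to know whether each column has at least one non-null value: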

    SELECT 
        CAST(CASE WHEN COUNT(t.Column1) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn1Data,
        CAST(CASE WHEN COUNT(t.Column2) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn2Data,
        CAST(CASE WHEN COUNT(t.Column3) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn3Data,
        CAST(CASE WHEN COUNT(t.Column4) > 0 THEN 1 ELSE 0 END AS BIT) AS HasColumn4Data
    FROM dbo.Table AS t
    WHERE t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp

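Any better ideas?

【Question comments】:

  • I would omit the cast() and just return 0 and 1.
  • Thanks for the reply, but why would removing the cast make a difference?
  • If you want to keep the cast, you can drop the CASE expression instead. In the BIT data type, any value >= 1 becomes 1: CAST(COUNT(t.Column1) as bit)
  • FYI: COUNT will be far faster than anything we write ourselves. (Possibly except when there is an index that can be exploited, and I'm not sure even then.)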
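
The comment about BIT conversion can be checked in isolation. The sketch below uses a throwaway table variable (`@Sample`, `ColA` and `ColB` are made-up names, not from the question) to show that `COUNT(column)` skips NULLs and that casting a non-zero count to BIT yields 1.

```sql
-- Hypothetical sample data: ColA has some values, ColB is entirely NULL.
DECLARE @Sample TABLE (ColA INT NULL, ColB INT NULL);

INSERT INTO @Sample (ColA, ColB)
VALUES (1, NULL), (NULL, NULL), (3, NULL);

SELECT
    COUNT(*)                 AS TotalRows,    -- counts every row
    COUNT(ColA)              AS NonNullColA,  -- NULLs are not counted
    CAST(COUNT(ColA) AS BIT) AS HasColAData,  -- any non-zero count becomes 1
    CAST(COUNT(ColB) AS BIT) AS HasColBData   -- a count of zero stays 0
FROM @Sample;
```

With this data the row should read 3, 2, 1, 0, so the CASE expression is not needed to get a 1/0 flag.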

Tags: sql sql-server tsql


【Solution 1】:

You could try something like this:

    ;WITH cte AS (
      SELECT * FROM dbo.Table WHERE TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp
    )
    -- Each scalar subquery counts at most one row, so it yields 1 or 0.
    -- (Combining the TOP 1 probes with CROSS JOIN would report 0 for every
    -- column as soon as any one column was entirely NULL.)
    SELECT
      (SELECT COUNT(*) FROM (SELECT TOP 1 Col1 FROM cte WHERE Col1 IS NOT NULL) s1) AS Col1,
      (SELECT COUNT(*) FROM (SELECT TOP 1 Col2 FROM cte WHERE Col2 IS NOT NULL) s2) AS Col2,
      (SELECT COUNT(*) FROM (SELECT TOP 1 Col3 FROM cte WHERE Col3 IS NOT NULL) s3) AS Col3,
      (SELECT COUNT(*) FROM (SELECT TOP 1 Col4 FROM cte WHERE Col4 IS NOT NULL) s4) AS Col4

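This has a potential advantage if every column contains values. In that case the table is only scanned up to the first non-null row (but this happens 4 times...). If any (or worse, all) of the columns are NULL in every row, you get a full scan for each such column. All in all, this may be useful if you expect your data to actually contain values.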
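
When each column is probed with its own TOP 1 subquery, it is worth checking how the per-column results are combined. The sketch below is only an illustration on a made-up table variable (`@Sample` is not from the answer). It compares a CROSS JOIN of the probes with one scalar subquery per column, for the case where one column is entirely NULL.

```sql
-- Hypothetical sample data: Col1 has a value, Col2 is entirely NULL.
DECLARE @Sample TABLE (Col1 INT NULL, Col2 INT NULL);

INSERT INTO @Sample (Col1, Col2)
VALUES (NULL, NULL), (7, NULL);

-- Variant A: CROSS JOIN of the TOP 1 probes.
-- The Col2 probe returns no rows, so the join produces no rows
-- and both counts come back as 0, even though Col1 has data.
SELECT COUNT(s1.Col1) AS Col1, COUNT(s2.Col2) AS Col2
FROM (SELECT TOP (1) Col1 FROM @Sample WHERE Col1 IS NOT NULL) AS s1
CROSS JOIN (SELECT TOP (1) Col2 FROM @Sample WHERE Col2 IS NOT NULL) AS s2;

-- Variant B: one scalar subquery per column.
-- Each probe is evaluated on its own, so Col1 reports 1 and Col2 reports 0.
SELECT
    (SELECT COUNT(*) FROM (SELECT TOP (1) Col1 FROM @Sample WHERE Col1 IS NOT NULL) AS s1) AS Col1,
    (SELECT COUNT(*) FROM (SELECT TOP (1) Col2 FROM @Sample WHERE Col2 IS NOT NULL) AS s2) AS Col2;
```

The early-exit behaviour described above applies to both variants. Only the way the per-column answers are assembled differs.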

【Comments】:

【Solution 2】:

If you have indexes on the columns, the following may be faster:

    select (case when exists (select 1
                              from table t
                              where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
                                    column1 is not null
                             )
                 then 1 else 0 end) as HasColumn1Data,
           (case when exists (select 1
                              from table t
                              where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
                                    column2 is not null
                             )
                 then 1 else 0 end) as HasColumn2Data,
           (case when exists (select 1
                              from table t
                              where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
                                    column3 is not null
                             )
                 then 1 else 0 end) as HasColumn3Data,
           (case when exists (select 1
                              from table t
                              where t.TimeStamp BETWEEN @StartTimeStamp and @EndTimeStamp and
                                    column4 is not null
                             )
                 then 1 else 0 end) as HasColumn4Data;
    

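Without indexes, this would be roughly 4 full table scans (admittedly each one is cut short at the first non-NULL value), so it may well take longer than the single aggregate query in the question.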
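
The answer does not say what kind of index it has in mind. One possible shape, assuming SQL Server 2008 or later (where filtered indexes are available), is sketched below. The table and column names mirror the question's placeholders, and the index name is invented for this example.

```sql
-- Assumption: dbo.[Table], [TimeStamp] and Column1 stand in for the real names;
-- IX_Table_TimeStamp_Column1NotNull is a made-up index name.
-- A filtered index stores only the rows where Column1 has a value, keyed by
-- TimeStamp, so an EXISTS probe can seek into the time range and stop at the
-- first entry it finds.
CREATE NONCLUSTERED INDEX IX_Table_TimeStamp_Column1NotNull
    ON dbo.[Table] ([TimeStamp])
    WHERE Column1 IS NOT NULL;
```

One such index per checked column would be needed, and each one adds write overhead. Whether the optimizer actually uses it depends on the data, so it is best confirmed against the real execution plan.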

【Comments】:

【Solution 3】:

This may end up being more cumbersome, but using EXISTS instead of COUNT may be preferable:

      SELECT  CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column1 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn1Data,
              CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column2 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn2Data,
              CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column3 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn3Data,
              CAST(CASE WHEN EXISTS(SELECT * FROM Table t WHERE t.Column4 IS NOT NULL AND t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp) THEN 1 ELSE 0 END AS BIT) AS HasColumn4Data
      

【Comments】:

【Solution 4】:

I would turn the query around so that it produces just the field names and whether each one is non-null:

        SELECT
          'COL1' AS column_name,
          CONVERT( BIT, COUNT( 1 ) ) AS is_not_entirely_null
        FROM
          foo
        WHERE
          column1 IS NOT NULL
        UNION
        SELECT
          'COL2' AS column_name,
          CONVERT( BIT, COUNT( 1 ) ) AS is_not_entirely_null
        FROM
          foo
        WHERE
          column2 IS NOT NULL
        

        ...

By the way, you should be able to generate the query above automatically with something like this:

        SELECT
          'SELECT ''' + c.name + ''' AS column_name, CONVERT( BIT, COUNT( 1 ) ) AS is_not_entirely_null FROM ' + t.name + ' WHERE ' + c.name + ' IS NOT NULL UNION'
        FROM 
          sysobjects AS t,
          syscolumns AS c
        WHERE
          t.name = 'foo' AND
          c.id = t.id
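
The generator above leaves a trailing UNION on the last line and relies on the older `sysobjects`/`syscolumns` views. A variant of the same idea is sketched below. It is not the answer's own method. It assumes SQL Server 2017 or later (for `STRING_AGG`) and reuses `dbo.foo` as the placeholder table name.

```sql
-- Assumption: dbo.foo is the placeholder table; requires SQL Server 2017+.
DECLARE @sql NVARCHAR(MAX);

-- Build one SELECT per column and join them with UNION ALL.
-- QUOTENAME(name, '''') yields a quoted string literal, QUOTENAME(name) a
-- bracketed identifier, so unusual column names do not break the statement.
SELECT @sql = STRING_AGG(
           CAST(N'SELECT ' + QUOTENAME(c.name, '''') + N' AS column_name, '
              + N'CONVERT(BIT, COUNT(1)) AS is_not_entirely_null '
              + N'FROM dbo.foo WHERE ' + QUOTENAME(c.name) + N' IS NOT NULL'
              AS NVARCHAR(MAX)),
           N' UNION ALL ')
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.foo');

-- @sql stays NULL if the table was not found, so guard the call.
IF @sql IS NOT NULL
    EXEC sys.sp_executesql @sql;
```

UNION ALL is used here because every branch returns a distinct column name, so there are no duplicates for UNION to remove.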
        

【Comments】:

【Solution 5】:

You can use this query:

          SELECT
          max(CASE WHEN t.Column1 IS NULL THEN 0 ELSE 1 END ) AS HasColumn1Data,
          max(CASE WHEN t.Column2 IS NULL THEN 0 ELSE 1 END ) AS HasColumn2Data,
          max(CASE WHEN t.Column3 IS NULL THEN 0 ELSE 1 END ) AS HasColumn3Data,
          max(CASE WHEN t.Column4 IS NULL THEN 0 ELSE 1 END ) AS HasColumn4Data
          FROM dbo.Table AS t
          WHERE t.TimeStamp BETWEEN @StartTimeStamp AND @EndTimeStamp
          

【Comments】:

  • You have removed the cast, so this will cause a syntax error. But if it is not really needed, we can drop the AS BIT as well.