Get the latest value for each column: answers

【Question title】: Get the latest value for each column
【Posted】: 2017-10-26 17:24:10
【Question】:

Suppose I have the following "Values" table in my SQL Server (2012) DB:

Table1:

Id   Col1   Col2   Col3   Col4

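I want to create a second "Overrides" table that stores values to override the originals, in case a user needs to do that. Given the table above, the Overrides table would look like this: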

Overrides:

FK_Id   Col1   Col2   Col3   Col4   When_Inserted

where Overrides.FK_Id references Table1.Id as a foreign key.

So, for example, suppose my Overrides table contains the following rows, all overriding the row in Table1 with Id = 1:

FK_Id:     Col1:             Col2:             Col3:              Col4:       When_Inserted:
1          Val1_1            Val2_1            Expected_Val3      NULL        1-Jan
1          NULL              Val2_2            NULL               NULL        2-Jan
1          NULL              Expected_Val2     NULL               NULL        3-Jan
1          Expected_Val1     NULL              NULL               NULL        4-Jan

Then, based on the When_Inserted column, with the most recent insert taking precedence, I want the resulting override to be:

FK_Id:     Col1:             Col2:             Col3:              Col4:
1          Expected_Val1     Expected_Val2     Expected_Val3      NULL
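
For anyone who wants to run the queries in this thread, the sample rows can be loaded with something like the script below. This is only a sketch: the column types, the year 2017, and the omission of the foreign key constraint (so that it runs without Table1) are assumptions, since the question gives only the column names and day/month values.

```sql
-- Assumed schema: the question does not state column types.
CREATE TABLE Overrides (
    FK_Id         int         NOT NULL,  -- would reference Table1.Id
    Col1          varchar(50) NULL,
    Col2          varchar(50) NULL,
    Col3          varchar(50) NULL,
    Col4          varchar(50) NULL,
    When_Inserted date        NOT NULL
);

-- Sample rows from the question; the year is an assumption.
INSERT INTO Overrides (FK_Id, Col1, Col2, Col3, Col4, When_Inserted) VALUES
    (1, 'Val1_1',        'Val2_1',        'Expected_Val3', NULL, '20170101'),
    (1, NULL,            'Val2_2',        NULL,            NULL, '20170102'),
    (1, NULL,            'Expected_Val2', NULL,            NULL, '20170103'),
    (1, 'Expected_Val1', NULL,            NULL,            NULL, '20170104');
```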

I have been trying to think of a clever way to write this SQL, and came up with a rather ugly solution:

SELECT
     FK_Id
    ,(
        SELECT TOP 1
            Col1
        FROM
            Overrides O1
        WHERE
            Col1 IS NOT NULL
            AND O1.FK_Id = O.FK_Id
        ORDER BY
            O1.When_Inserted DESC
      ) Col1

      ....  <same for each of the other columns>  ....

FROM
    Overrides O
GROUP BY
    FK_Id

I'm sure there must be a cleaner, more efficient way to do this.

【Comments】:

Check my script and let me know its output across all of your sample data.

Tags: sql sql-server group-by

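【Solution 1】:

Use a common table expression with row_number() (most recent first) and cross apply() to unpivot your columns, filter for the latest value of each column (rn = 1), and finally pivot() back to the original shape: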

;with cte as (
select o.fk_id, v.Col, v.Value, o.When_Inserted
  , rn = row_number() over (partition by o.fk_id, v.col order by o.when_inserted desc)
from overrides o
  cross apply (values('Col1',Col1),('Col2',Col2),('Col3',Col3),('Col4',Col4)
    ) v (Col,Value)
where v.value is not null
)
select fk_id, col1, col2, col3, col4
from (
  select fk_id, col, value
  from cte 
  where rn = 1
  ) s
pivot (max(Value) for Col in (col1,col2,col3,col4)) p

rextester demo: http://rextester.com/KGM96394

+-------+---------------+---------------+---------------+------+
| fk_id |     col1      |     col2      |     col3      | col4 |
+-------+---------------+---------------+---------------+------+
|     1 | Expected_Val1 | Expected_Val2 | Expected_Val3 | NULL |
+-------+---------------+---------------+---------------+------+
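
To see what the CTE holds before the pivot, the unpivot step can be run on its own. The sketch below assumes the Overrides table and sample rows from the question. It should return six rows: two for Col1, three for Col2, one for Col3, and none for Col4, because NULLs are filtered out. In each group the row with rn = 1 carries the value that survives.

```sql
-- Unpivot step only: one output row per non-NULL (row, column) pair,
-- numbered newest-first within each FK_Id and column name.
select o.FK_Id, v.Col, v.Value, o.When_Inserted
  , rn = row_number() over (partition by o.FK_Id, v.Col
                            order by o.When_Inserted desc)
from Overrides o
  cross apply (values ('Col1', o.Col1), ('Col2', o.Col2),
                      ('Col3', o.Col3), ('Col4', o.Col4)
    ) v (Col, Value)
where v.Value is not null
order by v.Col, rn;
```

Because all four columns pass through the single Value column, they need to share a compatible data type; columns of differing types would have to be converted first.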

dbfiddle.uk demo comparison of 3 methods

Looking at the io statistics for the sample:

unpivot/pivot version:

Table 'Worktable'. Scan count 0, logical reads 0
Table 'overrides'. Scan count 1, logical reads 1

first_value over() version:

Table 'Worktable'. Scan count 20, logical reads 100
Table 'overrides'. Scan count 1, logical reads 1

select top 1 subquery version:

Table 'overrides'. Scan count 5, logical reads 5
Table 'Worktable'. Scan count 0, logical reads 0
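
Figures like these come from SQL Server's per-session I/O statistics. A minimal sketch of collecting them, here wrapped around a one-column form of the subquery version and assuming the Overrides table from the question:

```sql
-- Report logical reads and scan counts for every statement that follows.
set statistics io on;

select O.FK_Id,
       (select top 1 O1.Col1
        from Overrides O1
        where O1.Col1 is not null
          and O1.FK_Id = O.FK_Id
        order by O1.When_Inserted desc) as Col1
from Overrides O
group by O.FK_Id;

set statistics io off;
```

With only four sample rows the whole table fits on a single page, so the counts mostly show how many times each plan touches the table, not how the methods would compare on a large table.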

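【Comments】:

Love the comparison!! Thank you so much! This will be my first time using PIVOT, so definitely interesting stuff! Thank you!! I'm just going to wait and see whether any other solutions come through....
@JohnBustos Happy to help!
The more I look at it, the more I realize how brilliant it is. It's just out of the league of my current knowledge base, but it's awesome... Thank you!!
@JohnBustos Here is a dbfiddle.uk demo that breaks the process down and shows you what is happening in three steps.
@JohnBustos Yes, it depends on what you will be doing with the data. If converting to varchar() is not a problem, then you could do this: rextester.com/WXIB19613. If that isn't a good solution, then you could use one of the other methods, or combine this with another method that handles the problematic columns.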

【Solution 2】:

You can use first_value():

select distinct fkid,
       first_value(col1) over (partition by fkid
                               order by (case when col1 is not null then 1 else 2 end),
                                        when_inserted desc
                              ) as col1,
       first_value(col2) over (partition by fkid
                               order by (case when col2 is not null then 1 else 2 end),
                                        when_inserted desc
                              ) as col2,
       . . .
from t;

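The select distinct is needed because SQL Server offers first_value() only as a window function, not as an aggregate function. Every row in a partition therefore returns the same values, and the duplicates have to be collapsed.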
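
Spelled out for all four columns, and using the question's table and column names (Overrides, FK_Id) in place of the t and fkid placeholders above, the query might look like the sketch below. The explicit rows frame is an addition that is not part of the original answer: without it the window defaults to a range frame, which SQL Server is generally reported to spool to an on-disk worktable. The frame should not change the result here, but that has not been measured against this data.

```sql
select distinct
       FK_Id,
       -- Non-NULL values sort first, newest first; NULLs sort last,
       -- so a column with no override at all still yields NULL.
       first_value(Col1) over (partition by FK_Id
                               order by case when Col1 is not null then 1 else 2 end,
                                        When_Inserted desc
                               rows unbounded preceding) as Col1,
       first_value(Col2) over (partition by FK_Id
                               order by case when Col2 is not null then 1 else 2 end,
                                        When_Inserted desc
                               rows unbounded preceding) as Col2,
       first_value(Col3) over (partition by FK_Id
                               order by case when Col3 is not null then 1 else 2 end,
                                        When_Inserted desc
                               rows unbounded preceding) as Col3,
       first_value(Col4) over (partition by FK_Id
                               order by case when Col4 is not null then 1 else 2 end,
                                        When_Inserted desc
                               rows unbounded preceding) as Col4
from Overrides;
```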

【Comments】:

Thanks!! That is pretty much what I had in mind, but it looks painful. There's no real shortcut or better way that you can think of, right??

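【Solution 3】:

Here is my solution, which is completely different.

IMHO, my script should perform better, provided it gives the correct output across all sample data.

I used an auto-generated identity id in my script, but if you don't have an identity column you can use ROW_NUMBER instead. My script is also easy to understand.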

declare @t table(id int identity(1,1),FK_Id int,Col1 varchar(50),Col2 varchar(50)
,Col3 varchar(50),Col4 varchar(50),When_Inserted date)
insert into @t VALUES
 (1 ,'Val1_1'  ,'Val2_1' ,'Expected_Val3',  NULL ,  '2017-01-1')
,(1 ,NULL     ,'Val2_2' , NULL ,  NULL,       '2017-01-2')
,(1 ,NULL     ,'Expected_Val2', NULL ,         NULL, '2017-01-3')
,(1 ,'Expected_Val1' , NULL  , NULL ,         NULL,  '2017-01-4')



;

WITH CTE
AS (
    SELECT *
        ,CASE 
            WHEN col1 IS NULL
                THEN NULL
            ELSE CONCAT (
                    cast(id AS VARCHAR(10))
                    ,'_'
                    ,col1
                    )
            END col1Code
        ,CASE 
            WHEN col2 IS NULL
                THEN NULL
            ELSE CONCAT (
                    cast(id AS VARCHAR(10))
                    ,'_'
                    ,col2
                    )
            END col2Code
        ,CASE 
            WHEN col3 IS NULL
                THEN NULL
            ELSE CONCAT (
                    cast(id AS VARCHAR(10))
                    ,'_'
                    ,col3
                    )
            END col3Code
        ,CASE 
            WHEN col4 IS NULL
                THEN NULL
            ELSE CONCAT (
                    cast(id AS VARCHAR(10))
                    ,'_'
                    ,col4
                    )
            END col4Code
    FROM @t
    )
    ,CTE1
AS (
    SELECT FK_Id
        ,max(col1Code) col1Code
        ,max(col2Code) col2Code
        ,max(col3Code) col3Code
        ,max(col4Code) col4Code
    FROM cte
    GROUP BY FK_Id
    )
SELECT FK_Id
    ,SUBSTRING(col1Code, charindex('_', col1Code) + 1, len(col1Code)) col1Code
    ,SUBSTRING(col2Code, charindex('_', col2Code) + 1, len(col2Code)) col2Code
    ,SUBSTRING(col3Code, charindex('_', col3Code) + 1, len(col3Code)) col3Code
    ,SUBSTRING(col4Code, charindex('_', col4Code) + 1, len(col4Code)) col4Code
FROM cte1 c1
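
One caveat with this approach: max() on a varchar compares character by character, so an id prefix of '10' sorts before '9', and the wrong row can win once the ids reach two digits. The identity id also reflects insertion order, which may differ from When_Inserted. The sketch below is a variant that numbers the rows with ROW_NUMBER ordered by When_Inserted and left-pads that number to a fixed width. The width of 10, the name seq, and the use of the question's Overrides table are assumptions for illustration.

```sql
with numbered as (
    select FK_Id, Col1, Col2, Col3, Col4,
           -- Fixed-width, zero-padded sequence so that string order
           -- matches numeric order (newest row gets the highest value).
           right('0000000000'
                 + cast(row_number() over (partition by FK_Id
                                           order by When_Inserted) as varchar(10)),
                 10) as seq
    from Overrides
)
select FK_Id,
       -- seq + NULL is NULL (default CONCAT_NULL_YIELDS_NULL setting),
       -- and max() ignores NULLs, so only rows with a value compete.
       -- stuff() then removes the 10-character prefix.
       stuff(max(seq + Col1), 1, 10, '') as Col1,
       stuff(max(seq + Col2), 1, 10, '') as Col2,
       stuff(max(seq + Col3), 1, 10, '') as Col3,
       stuff(max(seq + Col4), 1, 10, '') as Col4
from numbered
group by FK_Id;
```

Like the original script, this only works directly for character columns; other types would need converting to varchar and back.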

【Comments】: