【Question title】: SQL Server PIVOT on multiple columns is losing data
【Posted】: 2017-08-17 14:18:42
【Question】:

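I have spent far too much time on this and still have not reached the finish line. Please read it through before concluding that it is a duplicate of all the other multiple-column pivot questions on SO.

We have properties and units, and a table that tracks when something changes on a unit. We cannot change the structure of the table because it belongs to a vendor application.

Goal: extract the start and end dates of each period during which a unit is unavailable with the "MODEL" code.

Problem: I need to filter out the dates in between, when the unit was available, but every attempt drops one row of data (for unit 105).

My attempts use PIVOT and CROSS APPLY in combination with LEAD/LAG.

Here is a link to the SQLFiddle: http://sqlfiddle.com/#!6/29592/2/0

The rest of the question contains the T-SQL from the SQLFiddle, including the results I get. The desired result is at the end.

Create the table and insert sample data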

DROP TABLE IF EXISTS testModelUnit; 
CREATE TABLE testModelUnit(
    propertykey         INT             NOT NULL
    ,unitNumber         VARCHAR(10)     NOT NULL
    ,rowStartDate       DATETIME        NOT NULL
    ,rowEndDate         DATETIME        NOT NULL
    ,unavailableCode    varchar(10)     NULL
    ,CONSTRAINT pk_testModelUnit PRIMARY KEY (propertykey, unitNumber, rowStartDate )
)
GO

INSERT INTO testModelUnit VALUES 

(33,'105',  '2010-11-11 00:00:00.000','2016-11-11 00:00:00.000','MODEL')
,(33,'105', '2016-11-11 00:00:00.000','2016-12-14 07:51:03.307','MODEL')
,(33,'105', '2016-12-14 07:51:03.307','2017-01-01 00:00:00.000',NULL)
,(33,'105', '2017-01-01 00:00:00.00','2017-03-21 12:21:13.703','MODEL')
,(33,'105', '2017-03-21 12:21:13.703','2017-04-21 12:21:13.703','MODEL')
,(33,'105', '2017-04-21 12:21:13.703','9999-12-31 00:00:00.000','MODEL')
,(33,'2606','2017-04-21 12:21:23.207','9999-12-31 00:00:00.000','MODEL')
,(33,'2606','2017-04-19 10:30:09.227','2017-04-21 12:21:23.207','MODEL')
,(33,'2703','2016-12-14 07:51:03.307','2017-04-19 10:29:47.970','MODEL')
,(33,'2703','2011-11-11 00:00:00.000','2016-12-14 07:51:03.307','MODEL')

GO 

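This gives you all the data you need to test it: unit 105 was available for a short period at the end of 2016.

Attempt 1: use LEAD/LAG to determine whether a date is the first in a series, then use multiple PIVOT clauses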

SELECT
    propertykey         
    ,unitNumber     
    ,firstDate
    ,lastDate   
FROM (
    SELECT 
        propertykey         
        ,unitNumber         
        ,rowStartDate       
        ,rowEndDate     
        ,CASE 
            WHEN propertykey = LAG(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                AND unitNumber = LAG(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                AND LAG(rowEndDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowStartDate THEN NULL            
            ELSE 'firstDate'
        END ISFIRST
        ,CASE 
            WHEN propertykey = LEAD(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                AND unitNumber = LEAD(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                AND LEAD(rowStartDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowEndDate THEN NULL           
            ELSE 'lastDate'
        END ISLAST
    FROM testModelUnit
    WHERE UnavailableCode = 'model'
) SRC
PIVOT (
    MAX(rowStartDate)
    FOR isfirst in ([firstDate])
) as pivotFirst
PIVOT (
    MAX(rowEndDate)
    FOR islast in ([lastDate])
) as pivotLast

Result:

propertykey  unitNumber  firstDate                  lastDate
33           105         NULL                       9999-12-31 00:00:00.000
33           105         2010-11-11 00:00:00.000    NULL
33           105         2017-01-01 00:00:00.000    NULL
33           2606        NULL                       9999-12-31 00:00:00.000
33           2606        2017-04-19 10:30:09.227    NULL
33           2703        NULL                       2017-04-19 10:29:47.970
33           2703        2011-11-11 00:00:00.000    NULL

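There are two problems: first, the NULLs end up on separate rows; second, I am missing an end date for unit 105. (Reversing the order of the two PIVOT clauses reverses the problem, and I lose a start date instead.)

Attempt 2: use LAG/LEAD as before, but this time use CROSS APPLY to put the first/last values into a single column, then pivot the result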

SELECT 
    propertykey
    ,unitNumber
    ,firstDate
    ,lastDate
FROM(
    SELECT
        propertykey
        ,unitNumber
        ,ca.col
        ,ca.value       
    FROM 
    (
        SELECT 
            propertykey         
            ,unitNumber         
            ,rowStartDate       
            ,rowEndDate     
            ,CASE 
                WHEN propertykey = LAG(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                    AND unitNumber = LAG(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                    AND LAG(rowEndDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowStartDate THEN NULL            
                ELSE 'firstDate'
            END ISFIRST
            ,CASE 
                WHEN propertykey = LEAD(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                    AND unitNumber = LEAD(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                    AND LEAD(rowStartDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowEndDate THEN NULL           
                ELSE 'lastDate'
            END ISLAST
        FROM testModelUnit
        WHERE UnavailableCode = 'model'
    ) sub
    OUTER APPLY (
        SELECT ISFIRST, rowStartDate
        UNION ALL
        SELECT ISLAST, rowEndDate
    ) CA (col, value)
    WHERE col IS NOT NULL
)src
PIVOT
(
    max(value)
    for col in ([firstDate],[lastDate])
) AS pivoted

Result:

propertykey  unitNumber firstDate                 lastDate
33           105        2017-01-01 00:00:00.000   9999-12-31 00:00:00.000
33           2606       2017-04-19 10:30:09.227   9999-12-31 00:00:00.000
33           2703       2011-11-11 00:00:00.000   2017-04-19 10:29:47.970

Problem: I got rid of the NULL rows, but I am still missing one data record for unit 105.

Desired result:
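
Editorial note on why the row goes missing: PIVOT implicitly groups by every source column that is neither the aggregated column nor the FOR column. In the second attempt that leaves only propertykey and unitNumber, so each unit can produce at most one output row, and MAX keeps only the latest firstDate and the latest lastDate. The query below is a minimal sketch (not part of the original question) that lists the rows reaching the PIVOT for unit 105. The CASE expressions are simplified on the assumption that the PARTITION BY already guarantees matching propertykey and unitNumber, and the names flagged, isFirst and isLast are illustrative.

```sql
-- Rows that feed the PIVOT in the second attempt, restricted to unit 105.
WITH flagged AS (
    SELECT
        propertykey, unitNumber, rowStartDate, rowEndDate,
        -- 'firstDate' when the previous row does not end exactly where this one starts
        CASE WHEN LAG(rowEndDate) OVER (PARTITION BY propertykey, unitNumber
                                        ORDER BY rowStartDate) = rowStartDate
             THEN NULL ELSE 'firstDate' END AS isFirst,
        -- 'lastDate' when the next row does not start exactly where this one ends
        CASE WHEN LEAD(rowStartDate) OVER (PARTITION BY propertykey, unitNumber
                                           ORDER BY rowStartDate) = rowEndDate
             THEN NULL ELSE 'lastDate' END AS isLast
    FROM testModelUnit
    WHERE unavailableCode = 'MODEL'
)
SELECT f.propertykey, f.unitNumber, ca.col, ca.value
FROM flagged AS f
CROSS APPLY (VALUES (f.isFirst, f.rowStartDate),
                    (f.isLast,  f.rowEndDate)) AS ca (col, value)
WHERE ca.col IS NOT NULL
  AND f.unitNumber = '105'
ORDER BY ca.value;

-- Expected rows for the sample data:
-- 33  105  firstDate  2010-11-11 00:00:00.000
-- 33  105  lastDate   2016-12-14 07:51:03.307
-- 33  105  firstDate  2017-01-01 00:00:00.000
-- 33  105  lastDate   9999-12-31 00:00:00.000
```

Unit 105 contributes two firstDate rows and two lastDate rows, but nothing in the set tells PIVOT which pair belongs together, so the earlier pair is discarded by MAX. Any fix needs an extra column that identifies the period each row belongs to.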

propertykey      unitNumber firstDate                 lastDate
    33           105        2010-11-11 00:00:00.000   2016-12-14 07:51:03.307
    33           105        2017-01-01 00:00:00.000   9999-12-31 00:00:00.000
    33           2606       2017-04-19 10:30:09.227   9999-12-31 00:00:00.000
    33           2703       2011-11-11 00:00:00.000   2017-04-19 10:29:47.970

【Comments】:

    Tags: sql-server pivot sql-server-2016


    【Solution 1】:

    Are you looking for a query like the one below?

    -- Consecutive rows with the same unavailableCode share one Bucket value
    -- (the difference between the two row numbers).
    SELECT PropertyKey, UnitNumber,
           MIN(rowStartDate) AS FirstDate,
           MAX(rowEndDate)   AS LastDate
    FROM (
        SELECT *,
               Bucket = ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber ORDER BY rowStartDate)
                      - ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber, unavailableCode ORDER BY rowStartDate)
        FROM testModelUnit
    ) a
    WHERE a.unavailableCode IS NOT NULL   -- keep only the unavailable rows
    GROUP BY propertykey, unitNumber, Bucket
    

    Output:

    +-------------+------------+-------------------------+-------------------------+
    | PropertyKey | UnitNumber |        FirstDate        |        LastDate         |
    +-------------+------------+-------------------------+-------------------------+
    |          33 |        105 | 2010-11-11 00:00:00.000 | 2016-12-14 07:51:03.307 |
    |          33 |        105 | 2017-01-01 00:00:00.000 | 9999-12-31 00:00:00.000 |
    |          33 |       2606 | 2017-04-19 10:30:09.227 | 9999-12-31 00:00:00.000 |
    |          33 |       2703 | 2011-11-11 00:00:00.000 | 2017-04-19 10:29:47.970 |
    +-------------+------------+-------------------------+-------------------------+
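
    To see why the row-number difference works, it may help to look at the intermediate values. The following is a sketch, not part of the original answer: it prints both row numbers and their difference for unit 105 only, and the names numbered, rnAll and rnCode are illustrative.

```sql
-- Intermediate values behind the Bucket column, for unit 105 only.
WITH numbered AS (
    SELECT
        propertykey, unitNumber, rowStartDate, unavailableCode,
        -- position among all rows of the unit
        ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber
                           ORDER BY rowStartDate) AS rnAll,
        -- position among rows of the unit that share the same code (NULLs form one group)
        ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber, unavailableCode
                           ORDER BY rowStartDate) AS rnCode
    FROM testModelUnit
)
SELECT rowStartDate, unavailableCode, rnAll, rnCode, rnAll - rnCode AS Bucket
FROM numbered
WHERE unitNumber = '105'
ORDER BY rowStartDate;

-- Expected rows for the sample data:
-- 2010-11-11 00:00:00.000  MODEL  1  1  0
-- 2016-11-11 00:00:00.000  MODEL  2  2  0
-- 2016-12-14 07:51:03.307  NULL   3  1  2
-- 2017-01-01 00:00:00.000  MODEL  4  3  1
-- 2017-03-21 12:21:13.703  MODEL  5  4  1
-- 2017-04-21 12:21:13.703  MODEL  6  5  1
```

    The NULL row advances rnAll but not the MODEL row count, so every MODEL row after it gets a larger difference and lands in a new bucket. Two caveats: if the table can hold more than one non-NULL code, bucket values from different codes can collide, so it is probably safer to add unavailableCode to the GROUP BY; and the technique relies on row order only, so two MODEL runs separated by a date gap with no row in between would be merged.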
    

    Demo
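
    If the periods have to be split on date gaps as well (the question compares rowEndDate with the next rowStartDate), one alternative is to flag each row that starts a new period and number the periods with a running sum. This is a sketch under that assumption rather than the answerer's method, and the names flagged, numbered, startsIsland and islandId are illustrative.

```sql
-- Alternative sketch: number each unavailable period with a running sum of start flags.
WITH flagged AS (
    SELECT
        propertykey, unitNumber, rowStartDate, rowEndDate,
        -- 1 when the previous MODEL row does not end exactly where this one starts
        CASE WHEN LAG(rowEndDate) OVER (PARTITION BY propertykey, unitNumber
                                        ORDER BY rowStartDate) = rowStartDate
             THEN 0 ELSE 1 END AS startsIsland
    FROM testModelUnit
    WHERE unavailableCode = 'MODEL'
),
numbered AS (
    SELECT
        propertykey, unitNumber, rowStartDate, rowEndDate,
        -- the running total increases by one at every period start
        SUM(startsIsland) OVER (PARTITION BY propertykey, unitNumber
                                ORDER BY rowStartDate
                                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS islandId
    FROM flagged
)
SELECT propertykey, unitNumber,
       MIN(rowStartDate) AS firstDate,
       MAX(rowEndDate)   AS lastDate
FROM numbered
GROUP BY propertykey, unitNumber, islandId
ORDER BY propertykey, unitNumber, firstDate;
```

    On the sample data this should return the four rows listed as the desired result in the question. The islandId column is also the piece a PIVOT would need among its source columns in order to keep one output row per period.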

    【Comments】:

    • Brilliant! It avoids all the LAG/LEAD stuff. Less is more. Thanks!