【Question title】: Collapse data ranges and overlapping data in SQL Server 2014
【Posted】: 2015-06-20 01:09:53
【Question】:

I have a table that contains ranges like this:

ID  ActionCode  Group1  Type    Low         High
33  A           840     MM      000295800   000295899
34  A           840     MM      000295900   000295999

I need to collapse rows that hold contiguous data into a single row. For example, the two rows above would become

ActionCode  Group1  Type    Low         High
A           840     MM      000295800   000295999   

for each ActionCode, Group1, Type...

There may be overlapping data ranges, leading zeros, and so on.

Sample table:

IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
    DROP TABLE #TestTable

CREATE TABLE #TestTable(
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [ActionCode] [char](1) NOT NULL,
    [Group1] [varchar](50) NOT NULL,
    [Type] [varchar](2) NULL,
    [Low] [varchar](50) NOT NULL,
    [High] [varchar](50) NOT NULL,
    CONSTRAINT [PK_#TestTable] PRIMARY KEY CLUSTERED ([ID] ASC)  
) 

GO

INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401299870','401299879')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','AA','401644000','401646999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401378000','401378999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401644000','401646999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401299970','401299979')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','400424000','400424999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401299990','401299996')
-- Ds
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','401198000','401198999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','401649000','401649999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401299997','401299997')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','401376000','401390999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401655000','401668999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','400411000','400411999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','400414000','400414999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401646000','401646999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','400413000','400413999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401654000','401654999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892000','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892100','522892199')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892400','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892400','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892300','522892399')
-- Different Types overlap range
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892200','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892000','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892200','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892300','522892399')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892400','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892100','522892199')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892200','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892300','522892399')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892100','522892199')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892000','522892999')
-- Leading Zeros
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','MM','000295800','000295899')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','MM','000295900','000295999')
-- Overlap
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','NN','623295800','623295999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','NN','623295900','623295999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','NN','623295900','623296099')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','NN','623296100','623296299')
GO


SELECT * FROM #TestTable ORDER BY Low

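I can do this with a recursive CTE on a small table, but the real table has just under a million rows, and once the input grows past a certain size the query takes a very long time to run. There is an index on the "grouping" columns.

There must be a way to do this quickly; I have just hit a wall.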


【Question comments】:

    Tags: tsql sql-server-2014


    【Solution 1】:

    I think you are looking for something like this:

    WITH CTE
    AS
    (
        SELECT  ActionCode,
                Group1,
                [Type],
                Low,
                High,
                next_low =  LEAD(low,1) OVER (PARTITION BY ActionCode,Group1,[Type] ORDER BY ID),
                next_high = LEAD(high,1) OVER (PARTITION BY ActionCode,Group1,[Type] ORDER BY ID)
        FROM #TestTable
    )
    
    SELECT  ActionCode,
            Group1,
            [Type],
            Low,
            High
    FROM CTE
    WHERE       low != next_low 
            AND high!= next_high
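
    One thing to watch for in the filter above: LEAD returns NULL for the last row of each partition, and a comparison with NULL evaluates to UNKNOWN, so that row never passes the WHERE clause. A minimal sketch of the effect, using a hypothetical three-row inline table rather than #TestTable:

    ```sql
    -- Hypothetical demo data, unrelated to #TestTable
    SELECT  v,
            next_v = LEAD(v, 1) OVER (ORDER BY v),
            -- Mirrors the WHERE predicate: UNKNOWN falls through to ELSE
            kept   = CASE WHEN v != LEAD(v, 1) OVER (ORDER BY v) THEN 'yes' ELSE 'no' END
    FROM (VALUES (10), (20), (30)) AS t(v);
    -- The row with v = 30 has next_v = NULL, so kept = 'no' for that row.
    ```

    If the last row should survive, one option is LEAD's optional third argument, which supplies a default value in place of NULL.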
    

    【Comments】:

    • Almost, but not quite. For example, type NN should collapse to 623295900 and 623296299. I think you're on the right track... I'll work on this... I think it's a good stepping stone.
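    【Solution 2】:

    I am going to assume that your Low / High values are all integers, so you may need a slight adjustment depending on what you consider high and low.

    I am also going to pretend I did not see the contiguous part, in which case a simple GROUP BY takes care of it: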

    SELECT
        ActionCode
        ,Group1
        ,Type
        ,min(convert(int,Low)) AS Low
        ,max(convert(int,High)) AS High
    FROM #TestTable
    GROUP BY
        ActionCode
        ,Group1
        ,Type
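
    Note that converting to int drops the leading zeros mentioned in the question. A sketch of padding them back, assuming every value is meant to be nine characters wide as in the sample data (the width is an assumption, not something the question states):

    ```sql
    -- Assumes Low/High contain only digits and a fixed display width of 9
    SELECT  ActionCode, Group1, [Type],
            RIGHT(REPLICATE('0', 9) + CONVERT(varchar(10), MIN(CONVERT(int, Low))), 9)  AS Low,
            RIGHT(REPLICATE('0', 9) + CONVERT(varchar(10), MAX(CONVERT(int, High))), 9) AS High
    FROM #TestTable
    GROUP BY ActionCode, Group1, [Type];
    ```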
    

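    Assuming you really did mean contiguous groups based on ID, this becomes a classic "gaps and islands" problem, which can be solved by comparing a row number against the ID: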

    ;WITH Src AS
        (
            SELECT
                *
                ,ID-ROW_NUMBER() OVER (PARTITION BY ActionCode, Group1,Type ORDER BY ID) AS ContiguousGroupID
            FROM #TestTable
        )
    
    SELECT
        ContiguousGroupID
        ,ActionCode
        ,Group1
        ,Type
        ,min(ID) AS LowerIDBound
        ,max(ID) AS UpperIDBound
        ,min(convert(int,Low)) AS Low
        ,max(convert(int,High)) AS High
    FROM Src
    GROUP BY
        ContiguousGroupID
        ,ActionCode
        ,Group1
        ,Type
    ORDER BY
        ContiguousGroupID
        ,ActionCode
        ,Group1
        ,Type
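
    If "contiguous" instead refers to the Low / High values themselves, as the question's example suggests, the same gaps-and-islands idea can be applied to the ranges rather than to ID. The following is only a sketch under a few assumptions: Low / High always convert cleanly to int, two ranges belong together when they overlap or when one starts exactly one above where the other ends, and results are padded back to nine characters. The names Marked, Islands, PrevMaxHigh and IslandID are made up for illustration.

    ```sql
    ;WITH Src AS
    (
        SELECT  ActionCode, Group1, [Type],
                CONVERT(int, Low)  AS LowN,
                CONVERT(int, High) AS HighN
        FROM #TestTable
    ),
    Marked AS
    (
        SELECT  *,
                -- Highest High seen among the earlier rows of the same partition
                MAX(HighN) OVER (PARTITION BY ActionCode, Group1, [Type]
                                 ORDER BY LowN, HighN
                                 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PrevMaxHigh
        FROM Src
    ),
    Islands AS
    (
        SELECT  *,
                -- Start a new island when this row neither overlaps nor touches earlier rows
                SUM(CASE WHEN PrevMaxHigh IS NULL OR LowN > PrevMaxHigh + 1 THEN 1 ELSE 0 END)
                    OVER (PARTITION BY ActionCode, Group1, [Type]
                          ORDER BY LowN, HighN
                          ROWS UNBOUNDED PRECEDING) AS IslandID
        FROM Marked
    )
    SELECT  ActionCode, Group1, [Type],
            RIGHT('000000000' + CONVERT(varchar(10), MIN(LowN)), 9)  AS Low,
            RIGHT('000000000' + CONVERT(varchar(10), MAX(HighN)), 9) AS High
    FROM Islands
    GROUP BY ActionCode, Group1, [Type], IslandID
    ORDER BY ActionCode, Group1, [Type], Low;
    ```

    With the sample rows, the two MM ranges come out as the single row 000295800 to 000295999 shown in the question, and the four NN ranges merge into one row running from 623295800 to 623296299. Because this uses window functions instead of recursion, it may scale better than a recursive CTE, though that would need to be measured against the real table.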
    

    【Comments】:
