【Question title】: Merge adjacent rows in SQL?
【Posted】: 2016-08-02 20:16:09
【Question】:

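I'm doing some reporting based on the blocks of time that employees work. In some cases, the data contains two separate records for what is really a single block of time.

Here is a basic version of the table, with some sample records: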

EmployeeID
StartTime
EndTime

Data:

EmpID      Start         End
----------------------------
#1001   10:00 AM    12:00 PM
#1001    4:00 PM     5:30 PM
#1001    5:30 PM     8:00 PM

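In the example, the last two records are contiguous in time. I'd like to write a query that combines any adjacent records, so that the result set looks like this: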

EmpID      Start         End
----------------------------
#1001   10:00 AM    12:00 PM
#1001    4:00 PM     8:00 PM

Ideally, it should also handle more than two adjacent records, but that is not required.

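【Comments】:

  • Do you also have a column that stores the date?
  • @JeffRosenberg: Yes. These are datetime columns in the real table. The sample table is heavily simplified for the sake of the question.

Tags: sql sql-server-2008 tsql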


【Solution 1】:

This article offers many possible solutions to your problem:

http://www.sqlmag.com/blog/puzzled-by-t-sql-blog-15/tsql/solutions-to-packing-date-and-time-intervals-puzzle-136851

This one seems the most straightforward:

WITH StartTimes AS
(
  SELECT DISTINCT username, starttime
  FROM dbo.Sessions AS S1
  WHERE NOT EXISTS
    (SELECT * FROM dbo.Sessions AS S2
     WHERE S2.username = S1.username
       AND S2.starttime < S1.starttime
       AND S2.endtime >= S1.starttime)
),
EndTimes AS
(
  SELECT DISTINCT username, endtime
  FROM dbo.Sessions AS S1
  WHERE NOT EXISTS
    (SELECT * FROM dbo.Sessions AS S2
     WHERE S2.username = S1.username
       AND S2.endtime > S1.endtime
       AND S2.starttime <= S1.endtime)
)
SELECT username, starttime,
  (SELECT MIN(endtime) FROM EndTimes AS E
   WHERE E.username = S.username
     AND endtime >= starttime) AS endtime
FROM StartTimes AS S;
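
To sanity-check what a packing query like this should return, the same idea can be sketched outside the database. The following is a minimal Python sketch, not the article's method: it sorts the rows and extends the previous interval whenever the next one overlaps or touches it. The function name `pack_intervals` and the tuple layout `(key, start, end)` are assumptions made for illustration.

```python
from datetime import time


def pack_intervals(rows):
    """Merge intervals that overlap or touch, per key.

    rows: iterable of (key, start, end) tuples (hypothetical layout).
    Returns a list of merged (key, start, end) tuples, sorted by key and start.
    """
    packed = []
    for key, start, end in sorted(rows):
        if packed and packed[-1][0] == key and start <= packed[-1][2]:
            # Same key and the interval starts at or before the previous end:
            # extend the previous interval instead of adding a new one.
            prev_key, prev_start, prev_end = packed[-1]
            packed[-1] = (prev_key, prev_start, max(prev_end, end))
        else:
            packed.append((key, start, end))
    return packed


rows = [
    (1001, time(10, 0), time(12, 0)),
    (1001, time(16, 0), time(17, 30)),
    (1001, time(17, 30), time(20, 0)),
]
packed_result = pack_intervals(rows)
```

With the sample data from the question, `packed_result` holds two rows: 10:00 to 12:00 and 16:00 to 20:00.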

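【Discussion】:

    【Solution 2】:

    If the rows are strictly adjacent (not overlapping), you could try the following approach:

    1. Unpivot the timestamps.

    2. Keep only those that have no duplicates.

    3. Pivot the remaining ones back, pairing each Start with the End that directly follows it.

    Or, in Transact-SQL, something like this: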

    WITH unpivoted AS (
      -- Step 1: one row per timestamp, tagged as StartTime or EndTime
      SELECT
        EmpID,
        event,
        dtime,
        count = COUNT(*) OVER (PARTITION BY EmpID, dtime)
      FROM atable
      UNPIVOT (
        dtime FOR event IN (StartTime, EndTime)
      ) u
    )
    , filtered AS (
      -- Step 2: drop timestamps shared by an EndTime and the next StartTime
      SELECT
        EmpID,
        event,
        dtime,
        rowno = ROW_NUMBER() OVER (PARTITION BY EmpID, event ORDER BY dtime)
      FROM unpivoted
      WHERE count = 1
    )
    , pivoted AS (
      -- Step 3: PIVOT groups by EmpID and rowno, pairing the n-th start with the n-th end
      SELECT
        EmpID,
        StartTime,
        EndTime
      FROM filtered
      PIVOT (
        MAX(dtime) FOR event IN (StartTime, EndTime)
      ) p
    )
    SELECT *
    FROM pivoted
    ;
    

    A demo of this query is available at SQL Fiddle.
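
The three steps can also be traced outside SQL. Below is a minimal Python sketch of the same unpivot, filter, and re-pair logic, assuming strictly adjacent (never overlapping) rows as the answer does. The function name `merge_adjacent` and the `(emp, start, end)` tuple layout are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import time


def merge_adjacent(rows):
    """Merge strictly adjacent intervals; rows are (emp, start, end) tuples."""
    # Step 1: unpivot each row into two (emp, event, timestamp) records.
    events = []
    for emp, start, end in rows:
        events.append((emp, "start", start))
        events.append((emp, "end", end))

    # Step 2: keep only timestamps that occur once per employee. A timestamp
    # seen twice is an End that coincides with the next Start.
    counts = Counter((emp, ts) for emp, _, ts in events)
    starts, ends = defaultdict(list), defaultdict(list)
    for emp, event, ts in events:
        if counts[(emp, ts)] == 1:
            (starts if event == "start" else ends)[emp].append(ts)

    # Step 3: pivot back, pairing the n-th remaining start with the n-th end.
    merged = []
    for emp in sorted(starts):
        merged.extend(
            (emp, s, e) for s, e in zip(sorted(starts[emp]), sorted(ends[emp]))
        )
    return merged


adjacent_result = merge_adjacent([
    (1001, time(10, 0), time(12, 0)),
    (1001, time(16, 0), time(17, 30)),
    (1001, time(17, 30), time(20, 0)),
])
```

Because every interior boundary of a chain appears exactly twice, this also collapses chains of three or more adjacent rows into a single row.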

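    【Discussion】:

      【Solution 3】:

      A CTE with a cumulative sum (note that LAG, IIF, and SUM() OVER with ORDER BY require SQL Server 2012 or later):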

      DECLARE @t TABLE(EmpId INT, Start TIME, Finish TIME)
      INSERT INTO @t (EmpId, Start, Finish)
      VALUES
          (1001, '10:00 AM', '12:00 PM'),
          (1001, '4:00 PM', '5:30 PM'),
          (1001, '5:30 PM', '8:00 PM')
      
      ;WITH rowind AS (
          SELECT EmpId, Start, Finish,
              -- IIF returns 1 for each row that should generate a new row in the final result
              IIF(Start = LAG(Finish, 1) OVER(PARTITION BY EmpId ORDER BY Start), 0, 1) newrow
          FROM @t),
          groups AS (
          SELECT EmpId, Start, Finish,
              -- Cumulative sum
              SUM(newrow) OVER(PARTITION BY EmpId ORDER BY Start) csum
          FROM rowind)
      
      SELECT
          EmpId,
          MIN(Start) Start,
          MAX(Finish) Finish
      FROM groups
      GROUP BY EmpId, csum
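
To see how the flag and the running total form the groups, here is a minimal Python sketch of the same idea. It is an illustration under assumptions, not a translation of the query plan: the function name `merge_by_running_sum` and the `(emp, start, finish)` tuple layout are hypothetical.

```python
from datetime import time
from itertools import groupby


def merge_by_running_sum(rows):
    """Group adjacent rows with a new-row flag and a per-employee running sum."""
    tagged = []
    prev_emp, prev_finish, csum = None, None, 0
    for emp, start, finish in sorted(rows):  # ordered by employee, then start
        if emp != prev_emp:
            # New partition: restart the running sum and forget the previous row.
            csum, prev_finish = 0, None
        # Flag is 0 when this row starts exactly where the previous one
        # finished (the LAG comparison), otherwise 1; csum accumulates it.
        csum += 0 if start == prev_finish else 1
        tagged.append((emp, csum, start, finish))
        prev_emp, prev_finish = emp, finish

    merged = []
    # Rows sharing (emp, csum) belong to one block: take MIN(start), MAX(finish).
    for (emp, _), group in groupby(tagged, key=lambda r: (r[0], r[1])):
        group = list(group)
        merged.append((emp, min(r[2] for r in group), max(r[3] for r in group)))
    return merged


csum_result = merge_by_running_sum([
    (1001, time(10, 0), time(12, 0)),
    (1001, time(16, 0), time(17, 30)),
    (1001, time(17, 30), time(20, 0)),
])
```

For the sample rows the running sum is 1, 2, 2, so the last two rows fall into the same group and are merged.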
      

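      【Discussion】:

        【Solution 4】:

        I changed some names and types to keep the example small, but this works, should be very fast, and places no limit on the number of adjacent records: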

        with cte as (
          select 
            x1.id
            ,x1.t1
            ,x1.t2
            ,case when x2.t1 is null then 1 else 0 end as bef
            ,case when x3.t1 is null then 1 else 0 end as aft
          from x x1
          left join x x2 on x1.id=x2.id and x1.t1=x2.t2
          left join x x3 on x1.id=x3.id and x1.t2=x3.t1
          where x2.id is null
          or    x3.id is null
        )
        
        select 
          cteo.id
          ,cteo.t1
          ,isnull(z.t2,cteo.t2) as t2
        
        from cte cteo
        outer apply (select top 1 * 
                     from cte ctei 
                     where cteo.id=ctei.id and cteo.aft=0 and ctei.t1>cteo.t1
                     order by t1) z
        where cteo.bef=1
        

        And a fiddle for it: http://sqlfiddle.com/#!3/ad737/12/0

        【Discussion】:

          【Solution 5】:

          An option using an inline user-defined function and a CTE:

          -- Returns the combined range when the two ranges overlap or touch;
          -- otherwise returns the first range unchanged.
          CREATE FUNCTION dbo.Overlap
           (
            @availStart datetime,
            @availEnd datetime,
            @availStart2 datetime,
            @availEnd2 datetime
            )
          RETURNS TABLE
          RETURN
            SELECT CASE WHEN @availStart > @availEnd2 OR @availEnd < @availStart2
                        THEN @availStart ELSE
                                         CASE WHEN @availStart > @availStart2 THEN @availStart2 ELSE @availStart END
                                         END AS availStart,
                   CASE WHEN @availStart > @availEnd2 OR @availEnd < @availStart2
                        THEN @availEnd ELSE
                                       CASE WHEN @availEnd > @availEnd2 THEN @availEnd ELSE @availEnd2 END
                                       END AS availEnd
          GO  -- CREATE FUNCTION must be the only statement in its batch

          ;WITH cte AS
           (
            -- Number each employee's rows in start order
            SELECT EmpID, Start, [End], ROW_NUMBER() OVER (PARTITION BY EmpID ORDER BY Start) AS Id
            FROM dbo.TableName
            ), cte2 AS
           (
            SELECT Id, EmpID, Start, [End]
            FROM cte
            WHERE Id = 1
            UNION ALL
            -- Walk each employee's rows in order, carrying the merged range forward
            SELECT c.Id, c.EmpID, o.availStart, o.availEnd
            FROM cte c JOIN cte2 ct ON c.Id = ct.Id + 1
                                   AND c.EmpID = ct.EmpID  -- keep the recursion within one employee
                       CROSS APPLY dbo.Overlap(c.Start, c.[End], ct.Start, ct.[End]) AS o
            )
            SELECT EmpID, Start, MAX([End]) AS [End]
            FROM cte2
            GROUP BY EmpID, Start
          

          Demo on SQLFiddle

          【Discussion】:
