【Question title】: Simplify the timeline in SQL (Netezza)
【Posted】: 2015-07-14 07:59:29
【Question description】:

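I would like to summarize/simplify (I'm not sure what to call it) my timelines.

What I have is IDs with timelines. I'm trying to get rid of overlapping timelines within the same ID.

Here is an example of the data. What I have: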

ID   START_TIME   END_TIME
 1        a          b
 1        c          d
 1        e          f
 1        g          h


As you can see from the picture, [a,b], [c,d], and [e,f] overlap one another while [g,h] is disjoint, so I only want [a,f] and [g,h]. What I want:

ID   START_TIME   END_TIME
 1        a          f
 1        g          h

【Question discussion】:

  • @shA.t Can you give me some examples of the overlap/disjoint column?

Tags: mysql sql netezza


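【Solution 1】:

I think @shA.T is close. The problem is that the query breaks down when there are multiple overlaps. You may need to turn it into a multi-step process.

Step 1 (create a sample table):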

 -- Sample data: one ID with six intervals, some of which overlap.
 create temp table stack (
  id integer
  ,start_time timestamp
  ,end_time timestamp
 );

 insert into stack values (1, date('2020-01-01'), date('2020-01-01') + interval '3 hours');
 insert into stack values (1, date('2020-01-01') + interval '2 hours', date('2020-01-01') + interval '4 hours');
 insert into stack values (1, date('2020-01-01') + interval '3.5 hours', date('2020-01-01') + interval '5 hours');
 insert into stack values (1, date('2020-01-01') + interval '5.5 hours', date('2020-01-01') + interval '6.5 hours');
 insert into stack values (1, date('2020-01-01') + interval '7.5 hours', date('2020-01-01') + interval '9.5 hours');
 insert into stack values (1, date('2020-01-01') + interval '8.5 hours', date('2020-01-01') + interval '10.5 hours');

Step 2 (find single overlaps):

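create temp table stack2 as
 SELECT ID, ps2 as start_time, max(e) AS End_Time
    FROM (
        -- For each row (s, e), take the latest start (ps) and latest end (pe)
        -- among the rows that start earlier. The join does not compare IDs,
        -- which is harmless for the single-ID sample table.
        SELECT t1.ID, t1.START_TIME AS s, MAX(t1.END_TIME) AS e,
               max(t2.START_TIME) As ps, MAX(t2.END_TIME) AS pe
               -- If that earlier end falls inside [s, e], reuse the earlier start.
               ,CASE WHEN pe between s and e THEN ps ELSE s END ps2
        FROM stack AS t1
        JOIN stack AS t2 ON t1.START_TIME > t2.START_TIME
        GROUP BY t1.ID, t1.START_TIME) AS DT
    GROUP BY
        ID, ps2
    ORDER BY ps2;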

Step 3 (merge double overlaps):

 -- Same query as step 2, run against stack2 to merge intervals that still overlap.
 SELECT ID, ps2 as start_time, max(e) AS End_Time
    FROM (
        SELECT t1.ID, t1.START_TIME AS s, MAX(t1.END_TIME) AS e,
               max(t2.START_TIME) As ps, MAX(t2.END_TIME) AS pe
               ,CASE WHEN pe between s and e THEN ps ELSE s END ps2
        FROM stack2 AS t1
        JOIN stack2 AS t2 ON t1.START_TIME > t2.START_TIME
        GROUP BY t1.ID, t1.START_TIME) AS DT
    GROUP BY
        ID, ps2
    ORDER BY ps2;
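
As a way to sanity-check what the SQL steps return, the same merge can be sketched outside the database. The Python below is not a translation of the queries in this answer; it is a plain sort-and-sweep over the step 1 sample rows. The names `merge_intervals` and `h` are made up for this illustration, and the sketch assumes that intervals which merely touch (start equal to the previous end) should be merged, matching the inclusive behaviour of BETWEEN.

```python
from datetime import datetime, timedelta


def merge_intervals(rows):
    """Merge overlapping (id, start, end) rows and return them sorted."""
    merged = []
    for row_id, start, end in sorted(rows):
        if merged and merged[-1][0] == row_id and start <= merged[-1][2]:
            # Same ID and the row starts before the current interval ends:
            # extend the current interval instead of opening a new one.
            last_id, last_start, last_end = merged[-1]
            merged[-1] = (last_id, last_start, max(last_end, end))
        else:
            merged.append((row_id, start, end))
    return merged


base = datetime(2020, 1, 1)


def h(hours):
    """Hypothetical helper: timestamp at `hours` after 2020-01-01 00:00."""
    return base + timedelta(hours=hours)


# The six rows inserted in step 1.
sample = [
    (1, h(0), h(3)), (1, h(2), h(4)), (1, h(3.5), h(5)),
    (1, h(5.5), h(6.5)), (1, h(7.5), h(9.5)), (1, h(8.5), h(10.5)),
]

for row_id, start, end in merge_intervals(sample):
    print(row_id, start.time(), end.time())
# 1 00:00:00 05:00:00
# 1 05:30:00 06:30:00
# 1 07:30:00 10:30:00
```

Because the sweep keeps extending the current interval, a chain of any length collapses in one pass, so no extra step is needed for longer chains.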

【Discussion】:

  • Thanks. This works well.

【Solution 2】:

I found a solution that doesn't need any extra columns, like this:

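-- t stands for the table holding ID, START_TIME, and END_TIME.
SELECT ID, MIN(CASE WHEN pe between s and e THEN ps ELSE s END) AS START_TIME, MAX(e) AS End_Time
FROM (
    -- ps / pe: latest start and latest end among the rows that start earlier.
    SELECT t1.ID, t1.START_TIME AS s, t1.END_TIME AS e,
           MAX(t2.START_TIME) As ps, MAX(t2.END_TIME) AS pe
    FROM t AS t1
    JOIN t AS t2 ON t1.START_TIME > t2.START_TIME
    GROUP BY t1.ID, t1.START_TIME, t1.END_TIME ) AS DT
GROUP BY
    -- Overlap flag: 1 when the earlier end falls inside [s, e], otherwise 0.
    ID, CASE WHEN pe between s and e THEN 1 ELSE 0 END
-- Order by the output column; s is neither grouped nor aggregated here.
ORDER BY START_TIME;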

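【Discussion】:

  • Thank you very much, but it seems to need some adjustment. Would you mind explaining what "CASE WHEN pe between s and e THEN 1 ELSE 0 END" does in the GROUP BY clause?
  • @devon It is that extra overlap/disjoint column; it means a new group when the previous end time (pe) falls within the duration of the current task (between s and e) ;).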
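
One thing to keep in mind about a 0/1 flag used directly as a grouping key is that it can produce at most two groups per ID, which may be related to the adjustment the commenter mentions. A common variant, often described as a gaps-and-islands approach, keeps a running total of a "starts a new island" flag so that every disjoint island gets its own key. The Python sketch below illustrates that idea only; it is not the answerer's query, and the integer rows and the names `prev_max_end`, `new_island`, and `island_id` are invented for the example.

```python
from itertools import accumulate

# Hypothetical rows for one ID, as (start, end), already sorted by start.
rows = [(1, 4), (3, 6), (5, 8), (10, 12), (15, 18), (17, 20)]

# Largest end time among all earlier rows (None for the first row).
prev_max_end = [None] + list(accumulate((end for _, end in rows), max))[:-1]

# 1 when the row starts a new disjoint island, 0 when it overlaps earlier rows.
new_island = [
    1 if pe is None or start > pe else 0
    for (start, _), pe in zip(rows, prev_max_end)
]

# A running total turns the 0/1 flag into one grouping key per island.
island_id = list(accumulate(new_island))

# Group by the island key: earliest start and latest end per island.
merged = {}
for (start, end), key in zip(rows, island_id):
    lo, hi = merged.get(key, (start, end))
    merged[key] = (min(lo, start), max(hi, end))

print(new_island)               # [1, 0, 0, 1, 1, 0]
print(island_id)                # [1, 1, 1, 2, 3, 3]
print(sorted(merged.values()))  # [(1, 8), (10, 12), (15, 20)]
```

Grouping by `new_island` alone would lump the islands starting at 10 and 15 together, while grouping by `island_id` keeps them apart.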