WITH data as (
select time::timestamp as time, value from values
('2020-03-10 9:50', 1 ),
('2020-03-10 9:51', 3 ),
('2020-03-10 9:52', 1 ),
('2020-03-10 9:53', 2 ),
('2020-03-10 9:54', 0 ),
('2020-03-10 9:55', 0 ),
('2020-03-10 9:56', 1 ),
('2020-03-10 9:57', 3 ),
('2020-03-10 9:58', 2 ),
('2020-03-10 9:59', 3 ),
('2020-03-10 10:00', 2 ),
('2020-03-10 10:01', 2 ),
('2020-03-10 10:02', 0 ),
('2020-03-10 10:03', 3 ),
('2020-03-10 10:04', 1 ),
('2020-03-10 10:05', 1 ),
('2020-03-10 10:06', 1 )
s( time, value)
)
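select
    a.time
    ,a.value
    -- earliest trigger seen so far within the current reset group
    ,min(trig_time) over (partition by reset_time_group order by time) as first_trigger_time
    -- only the first trigger row of a group reports the minutes until its reset
    ,iff(a.time = first_trigger_time, datediff('minute', first_trigger_time, reset_time_group), null) as trig_duration
from (
    select d.time
        ,d.value
        ,iff(d.value >= 3, d.time, null) as trig_time  -- row is a trigger
        ,iff(d.value = 0, d.time, null) as reset_time  -- row is a reset
        -- time of the last row, used when no later reset exists
        ,max(time) over (order by time ROWS BETWEEN 1 PRECEDING AND UNBOUNDED FOLLOWING) as max_time
        -- next reset after this row, else the last row's time
        ,coalesce(lead(reset_time) ignore nulls over (order by d.time), max_time) as lead_reset_time
        ,coalesce(reset_time, lead_reset_time) as reset_time_group
    from data as d
) as a
order by time;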
This gives the results you seem to expect/describe:
TIME                     VALUE  FIRST_TRIGGER_TIME       TRIG_DURATION
2020-03-10 09:50:00.000  1
2020-03-10 09:51:00.000  3      2020-03-10 09:51:00.000  3
2020-03-10 09:52:00.000  1      2020-03-10 09:51:00.000
2020-03-10 09:53:00.000  2      2020-03-10 09:51:00.000
2020-03-10 09:54:00.000  0      2020-03-10 09:51:00.000
2020-03-10 09:55:00.000  0
2020-03-10 09:56:00.000  1
2020-03-10 09:57:00.000  3      2020-03-10 09:57:00.000  5
2020-03-10 09:58:00.000  2      2020-03-10 09:57:00.000
2020-03-10 09:59:00.000  3      2020-03-10 09:57:00.000
2020-03-10 10:00:00.000  2      2020-03-10 09:57:00.000
2020-03-10 10:01:00.000  2      2020-03-10 09:57:00.000
2020-03-10 10:02:00.000  0      2020-03-10 09:57:00.000
2020-03-10 10:03:00.000  3      2020-03-10 10:03:00.000  3
2020-03-10 10:04:00.000  1      2020-03-10 10:03:00.000
2020-03-10 10:05:00.000  1      2020-03-10 10:03:00.000
2020-03-10 10:06:00.000  1      2020-03-10 10:03:00.000
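So how this works: we find the trigger times and reset times, then work out max_time for the last-row edge case. After that we find the next reset_time going forward, using max_time if there is none, and then pick the row's own reset time or, failing that, the lead_reset_time just computed. For the work you are doing here, that last step could be skipped, because your data cannot trigger and reset on the same row. And given that we do the math on the trigger row, it does not matter whether the reset row knows which group it is in.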
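To see what the inner layer produces before any grouping happens, it can help to run it on its own. The following is a minimal sketch, assuming a shortened copy of the sample data (the six rows are picked only for illustration) and the same column names as the query above:

```sql
-- Sketch: the inner select in isolation, on a shortened sample
with data as (
    select time::timestamp as time, value
    from values
        ('2020-03-10 09:56', 1),
        ('2020-03-10 09:57', 3),
        ('2020-03-10 09:58', 2),
        ('2020-03-10 10:02', 0),
        ('2020-03-10 10:03', 3),
        ('2020-03-10 10:04', 1)
        s(time, value)
)
select d.time
    ,d.value
    ,iff(d.value >= 3, d.time, null) as trig_time  -- set only on trigger rows
    ,iff(d.value = 0, d.time, null) as reset_time  -- set only on reset rows
    -- the frame always reaches the final row, so this is the last timestamp
    ,max(d.time) over (order by d.time rows between 1 preceding and unbounded following) as max_time
    -- next non-null reset_time after this row, else max_time
    ,coalesce(lead(reset_time) ignore nulls over (order by d.time), max_time) as lead_reset_time
    -- a reset row belongs to its own group; other rows to the next reset
    ,coalesce(reset_time, lead_reset_time) as reset_time_group
from data as d
order by d.time;
```

With this sample, the rows from 09:56 to 10:02 should share the reset_time_group 10:02, while 10:03 and 10:04 have no later reset and fall back to max_time (10:04).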
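Then we break into a new select layer, because we have reached Snowflake's limit for nested/interrelated SQL, and take a min across reset_time_group to find the first trigger time. We then compare that to the row's time and do a date diff on it.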
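As a side note, datediff is a little naive in its math: '2020-01-01 23:59:59' and '2020-01-02 00:00:01' are 2 seconds apart, but they also count as 1 minute, 1 hour, and 1 day apart, because the function truncates both timestamps to the selected unit and then diffs those results.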
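The truncation behavior is easy to check directly. This is a small sketch using the two timestamps from the note above; the column aliases are made up for illustration, and the last column shows one possible way (an assumption, not part of the original answer) to get a fractional minute count by diffing in seconds first:

```sql
select
    datediff('second', '2020-01-01 23:59:59'::timestamp, '2020-01-02 00:00:01'::timestamp) as diff_seconds  -- 2
    ,datediff('minute', '2020-01-01 23:59:59'::timestamp, '2020-01-02 00:00:01'::timestamp) as diff_minutes -- 1
    ,datediff('hour', '2020-01-01 23:59:59'::timestamp, '2020-01-02 00:00:01'::timestamp) as diff_hours     -- 1
    ,datediff('day', '2020-01-01 23:59:59'::timestamp, '2020-01-02 00:00:01'::timestamp) as diff_days       -- 1
    -- diff in the finest unit needed, then divide, to avoid the unit-boundary effect
    ,datediff('second', '2020-01-01 23:59:59'::timestamp, '2020-01-02 00:00:01'::timestamp) / 60 as diff_minutes_fractional
;
```

For the sample data in this answer every timestamp falls on a whole minute, so the minute-level diff is not affected.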
To get the final batch with a value of 4, as asked in the request, change the lead_reset_time line to:
,coalesce(lead(reset_time)ignore nulls over(order by d.time), dateadd('minute', 1, max_time)) as lead_reset_time
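This moves max_time forward by one minute, which fits if you want to assume that, since there is no data further in the future, the state of the existing row at 10:06 stays valid for 1 minute. That is not how I would do it, but there is the code you want.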