【Posted】:2023-04-02 23:38:01
【Question】:
I have a table like this:
id time activity
1: 1 1 a
2: 1 2 a
3: 1 3 b
4: 1 4 b
5: 1 5 a
6: 2 1 a
7: 2 2 b
8: 2 3 b
9: 2 4 b
10: 2 5 a
11: 2 6 a
12: 2 7 c
13: 2 8 c
14: 2 9 c
Within each id, I want to aggregate over consecutive blocks of activity. So basically I want a grouping column like this:
id time activity grouping
1: 1 1 a 1
2: 1 2 a 1
3: 1 3 b 2
4: 1 4 b 2
5: 1 5 a 3
6: 2 1 a 1
7: 2 2 b 2
8: 2 3 b 2
9: 2 4 b 2
10: 2 5 a 3
11: 2 6 a 3
12: 2 7 c 4
13: 2 8 c 4
14: 2 9 c 4
That way I can use aggregate functions and get something like this:
select id
, min(time) as min_time
, max(time) as max_time
, count(*) as n_activity
from A
group by id, grouping
id min_time max_time n_activity
1: 1 1 2 2
2: 1 3 4 2
3: 1 5 5 1
4: 2 1 1 1
5: 2 2 4 3
6: 2 5 6 2
7: 2 7 9 3
How can I create the grouping column? My table is large, so I would like to avoid cursor functions if possible.
Some sample data:
create table A (id int, time int, activity varchar);
insert into A (id, time, activity)
values
(1,1,'a'),(1,2,'a'),(1,3,'b'),(1,4,'b'),(1,5,'a'),(2,1,'a'),
(2,2,'b'),(2,3,'b'),(2,4,'b'),(2,5,'a'),(2,6,'a'),(2,7,'c'),
(2,8,'c'),(2,9,'c');
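For reference, here is a rough sketch of one set-based way the grouping column described above might be derived without a cursor: flag each row whose activity differs from the previous row within the same id (using lag), then take a running sum of those flags. This is only a sketch under assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for Redshift, and the names flagged, blocks, is_new_block and grp are made up for illustration. Whether the same query performs acceptably on a large Redshift table has not been checked here.

```python
import sqlite3

# In-memory SQLite is used only as a stand-in for Redshift (an assumption),
# so that the sketch can be run as-is. Window functions need SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.execute("create table A (id int, time int, activity varchar)")
conn.executemany(
    "insert into A (id, time, activity) values (?, ?, ?)",
    [(1, 1, 'a'), (1, 2, 'a'), (1, 3, 'b'), (1, 4, 'b'), (1, 5, 'a'),
     (2, 1, 'a'), (2, 2, 'b'), (2, 3, 'b'), (2, 4, 'b'), (2, 5, 'a'),
     (2, 6, 'a'), (2, 7, 'c'), (2, 8, 'c'), (2, 9, 'c')],
)

# flagged: is_new_block is 1 when the activity differs from the previous row
#          of the same id (or there is no previous row), otherwise 0.
# blocks:  grp is the running sum of those flags, i.e. the block number
#          within each id (hypothetical name, standing in for "grouping").
query = """
with flagged as (
    select id, time, activity,
           case when lag(activity) over (partition by id order by time) = activity
                then 0 else 1 end as is_new_block
    from A
),
blocks as (
    select id, time, activity,
           sum(is_new_block) over (
               partition by id order by time
               rows between unbounded preceding and current row
           ) as grp
    from flagged
)
select id, min(time) as min_time, max(time) as max_time, count(*) as n_activity
from blocks
group by id, grp
order by id, min_time
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The explicit frame clause (rows between unbounded preceding and current row) is spelled out so the running sum does not depend on an engine's default frame.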
【Discussion】:
Tags: sql amazon-redshift