[Posted]: 2021-12-26 14:30:59
[Question]:
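I have a data frame with 3 columns:
- 'Date': unique for each row
- 'Group': the group each row belongs to
- 'ID': an identifier for each row
Some values in the 'ID' column are repeated. I want to count the number of repetitions for each group and each date.
Example data frame: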
Date Group ID
0 2021-10-29 09:15:52 B 9352
1 2021-10-29 10:58:57 A 9352
2 2021-10-29 11:20:46 C 9352
3 2021-10-29 12:47:34 C 6274
4 2021-10-29 15:41:35 C 1677
5 2021-10-29 16:12:39 B 1677
6 2021-10-29 18:57:56 B 9225
7 2021-10-29 19:46:46 C 9225
8 2021-10-30 01:23:07 C 9225
9 2021-10-30 02:13:57 A 9225
10 2021-10-30 05:03:52 B 9329
11 2021-10-30 07:48:39 B 9329
12 2021-10-30 08:45:00 A 9329
13 2021-10-30 11:17:47 C 9329
14 2021-10-30 21:46:07 C 9496
15 2021-10-30 22:13:29 A 1218
16 2021-10-31 05:39:38 C 2422
17 2021-10-31 05:39:41 C 9654
18 2021-10-31 10:10:21 A 1951
19 2021-10-31 10:19:45 A 1951
20 2021-10-31 16:40:10 A 1951
21 2021-10-31 16:41:23 C 1951
22 2021-10-31 22:07:16 A 1951
23 2021-11-01 00:26:30 C 2867
24 2021-11-01 01:25:46 B 2867
25 2021-11-01 01:53:16 B 4262
26 2021-11-01 01:58:30 A 4581
27 2021-11-01 05:23:26 C 1734
28 2021-11-01 05:38:22 C 1734
29 2021-11-01 05:47:43 C 1734
30 2021-11-01 06:49:27 A 4813
31 2021-11-01 07:54:02 C 4813
32 2021-11-01 12:10:48 C 8661
33 2021-11-01 14:32:50 C 4138
34 2021-11-01 18:38:16 B 4138
35 2021-11-01 21:33:37 C 4138
36 2021-11-02 02:34:53 B 4138
37 2021-11-02 04:45:56 C 4138
38 2021-11-02 07:33:38 C 4138
39 2021-11-02 07:40:33 C 4138
40 2021-11-02 08:06:21 B 4138
41 2021-11-02 08:32:20 C 4138
42 2021-11-02 09:47:26 A 4138
43 2021-11-02 15:51:33 C 4138
44 2021-11-02 16:04:33 B 2433
45 2021-11-02 20:47:01 B 2433
46 2021-11-03 04:11:57 A 4594
47 2021-11-03 12:36:16 A 6829
48 2021-11-03 14:42:14 A 6829
49 2021-11-03 18:03:27 B 7138
50 2021-11-03 18:46:46 C 7138
51 2021-11-03 19:07:01 C 7138
52 2021-11-03 19:23:02 A 9752
53 2021-11-03 21:10:51 A 2699
54 2021-11-04 00:58:12 C 2699
55 2021-11-04 03:44:12 A 7463
56 2021-11-04 05:40:07 C 4558
57 2021-11-04 05:56:51 C 7855
58 2021-11-04 06:27:28 C 7855
59 2021-11-04 07:50:46 C 7855
Desired result:
Date Group Repetitions
0 2021-10-29 A 1
1 2021-10-29 B 3
2 2021-10-29 C 3
3 2021-10-30 A 2
4 2021-10-30 B 2
5 2021-10-30 C 2
6 2021-10-31 A 4
7 2021-10-31 B 0
8 2021-10-31 C 1
9 2021-11-01 A 1
10 2021-11-01 B 2
11 2021-11-01 C 7
12 2021-11-02 A 1
13 2021-11-02 B 4
14 2021-11-02 C 5
15 2021-11-03 A 3
16 2021-11-03 B 1
17 2021-11-03 C 2
18 2021-11-04 A 0
19 2021-11-04 B 0
20 2021-11-04 C 4
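Note that the duplicate condition spans dates and groups: in the example above, 'ID' 2699 counts as a duplicate even though its occurrences fall on different dates and in different groups.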
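To make the requirement concrete, here is a minimal sketch of one possible reading of it, not a definitive solution. It assumes that a row counts as a repetition when its 'ID' appears more than once anywhere in the frame, and that date/group combinations with no rows should show 0. The function name `count_repetitions` is hypothetical, and the data is only rows 46-59 of the sample above (the IDs in those rows do not occur elsewhere in the sample, so the counts agree with the desired result for 2021-11-03 and 2021-11-04).

```python
import pandas as pd

# Rows 46-59 of the sample data frame from the question.
rows = [
    ("2021-11-03 04:11:57", "A", 4594),
    ("2021-11-03 12:36:16", "A", 6829),
    ("2021-11-03 14:42:14", "A", 6829),
    ("2021-11-03 18:03:27", "B", 7138),
    ("2021-11-03 18:46:46", "C", 7138),
    ("2021-11-03 19:07:01", "C", 7138),
    ("2021-11-03 19:23:02", "A", 9752),
    ("2021-11-03 21:10:51", "A", 2699),
    ("2021-11-04 00:58:12", "C", 2699),
    ("2021-11-04 03:44:12", "A", 7463),
    ("2021-11-04 05:40:07", "C", 4558),
    ("2021-11-04 05:56:51", "C", 7855),
    ("2021-11-04 06:27:28", "C", 7855),
    ("2021-11-04 07:50:46", "C", 7855),
]
df = pd.DataFrame(rows, columns=["Date", "Group", "ID"])
df["Date"] = pd.to_datetime(df["Date"])


def count_repetitions(df):
    """Count rows with a repeated ID per calendar day and group (hypothetical helper)."""
    # Assumption: an ID is "repeated" if it occurs more than once anywhere
    # in the frame, regardless of date or group. keep=False flags every
    # occurrence, including the first one.
    is_dup = df["ID"].duplicated(keep=False)

    # Drop the time of day so that rows group by calendar date.
    day = df["Date"].dt.normalize().rename("Date")

    # Summing booleans counts the flagged rows in each (date, group) pair.
    counts = is_dup.groupby([day, df["Group"]]).sum()

    # Add the (date, group) pairs that have no rows at all, with a count of 0.
    full_index = pd.MultiIndex.from_product(
        [day.sort_values().unique(), sorted(df["Group"].unique())],
        names=["Date", "Group"],
    )
    counts = counts.reindex(full_index, fill_value=0)

    return counts.rename("Repetitions").reset_index()


result = count_repetitions(df)
print(result)
```

On this subset the 'Repetitions' column comes out as 3, 1, 2 for 2021-11-03 (A, B, C) and 0, 0, 4 for 2021-11-04. The reindex step is what produces the zero row for group B on 2021-11-04, which has no rows in the data.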
[Discussion]:
Tags: python pandas dataframe group-by