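【Posted】: 2023-03-20 18:05:01
【Question】:
I have a large dataset with four columns: user_id, action, start_time, and end_time. I want to merge consecutive rows with action "o" so that the merged row takes the start_time of the first row and the end_time of the last row in the run.
Suppose df is: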
"user_id","action","start_time","end_time"
"11","o",23:25:27,23:25:49
"11","o",23:25:28,23:25:28
"11","o",23:25:48,23:26:50
"11","v",23:25:49,23:25:49
"11","v",23:25:49,23:25:50
"11","o",23:28:24,00:22:33
"11","o",00:10:48,00:23:44
"22","o",00:11:52,00:22:33
"22","o",00:22:32,00:27:44
"22","v",00:22:42,00:22:42
"22","o",00:22:42,00:22:42
"22","z",00:22:42,00:22:43
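I want to merge rows 1, 2, and 3 because they all have action "o"; the merged row has the start_time of the first row and the end_time of the last (third) row. The same applies to rows 6 and 7, and to rows 8 and 9.
So the desired output is: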
"user_id","action","start_time","end_time"
"11","o",23:25:27,23:26:50
"11","v",23:25:49,23:25:49
"11","v",23:25:49,23:25:50
"11","o",23:28:24,00:23:44
"22","o",00:11:52,00:27:44
"22","v",00:22:42,00:22:42
"22","o",00:22:42,00:22:42
"22","z",00:22:42,00:22:43
How can I do this in R? Thanks.
【Comments】:
-
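I think there is an error in your desired output. Rows 4 and 5 of the input should be merged together, so the sequence of action values in your desired output should be: o, v, o, o, v, o, z, I think.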
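
One possible approach, as a minimal base R sketch rather than a definitive answer: give each run of consecutive "o" rows within the same user_id a run id, then keep the first row of each run and overwrite its end_time with that of the run's last row. The helper name merge_runs and its merge_action parameter are hypothetical, introduced only for illustration. The sketch assumes the rows are already in the intended order and keeps the times as character strings.

```r
# Sample data from the question (times kept as character strings).
df <- data.frame(
  user_id    = c("11", "11", "11", "11", "11", "11", "11",
                 "22", "22", "22", "22", "22"),
  action     = c("o", "o", "o", "v", "v", "o", "o",
                 "o", "o", "v", "o", "z"),
  start_time = c("23:25:27", "23:25:28", "23:25:48", "23:25:49",
                 "23:25:49", "23:28:24", "00:10:48", "00:11:52",
                 "00:22:32", "00:22:42", "00:22:42", "00:22:42"),
  end_time   = c("23:25:49", "23:25:28", "23:26:50", "23:25:49",
                 "23:25:50", "00:22:33", "00:23:44", "00:22:33",
                 "00:27:44", "00:22:42", "00:22:42", "00:22:43"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse consecutive rows whose action equals
# merge_action, separately for each user_id.
merge_runs <- function(df, merge_action = "o") {
  n <- nrow(df)
  if (n < 2) return(df)

  # A row continues the previous run only if both rows belong to the
  # same user and both carry the action to be merged.
  same_run <- c(FALSE,
                df$user_id[-1] == df$user_id[-n] &
                  df$action[-1] == merge_action &
                  df$action[-n] == merge_action)

  # Every row that does not continue a run starts a new run id.
  run_id <- cumsum(!same_run)

  first <- !duplicated(run_id)                   # first row of each run
  last  <- !duplicated(run_id, fromLast = TRUE)  # last row of each run

  out <- df[first, , drop = FALSE]
  out$end_time <- df$end_time[last]              # end_time of the last row
  rownames(out) <- NULL
  out
}

result <- merge_runs(df)
print(result)
```

With the sample data this merges only the "o" runs, as the question describes. If runs of any repeated action should be merged, as the comment above suggests, the two comparisons against merge_action could be replaced by df$action[-1] == df$action[-n].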