[Posted]: 2021-06-27 17:29:04
[Question]:
This is my data table:
res = structure(list(ID = c("0c980", "0c980", "0c980", "91320", "f3750", "1b970", "1b970", "1b970", "1b970"),
datetime = structure(c(1547128003, 1549873907, 1550057899, 1544261100, 1550409081, 1547295708, 1561875112, 1562846678, 1564143917), class = c("POSIXct", "POSIXt"), tzone = "")),
row.names = c(NA, -9L),
class = c("data.table", "data.frame"))
The data looks like this:
ID datetime
1: 0c980 2019-01-10 21:46:43
2: 0c980 2019-02-11 16:31:47
3: 0c980 2019-02-13 19:38:19
4: 91320 2018-12-08 17:25:00
5: f3750 2019-02-17 21:11:21
6: 1b970 2019-01-12 20:21:48
7: 1b970 2019-06-30 14:11:52
8: 1b970 2019-07-11 20:04:38
9: 1b970 2019-07-26 20:25:17
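I want to number each record based on the time interval between adjacent records within the same ID: the number should increase whenever the gap to the previous record is at least 7 days.
I wrote the following function: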
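myFun = function(x, interval = 7){
  if(length(x) == 1){
    d = 1
  }else{
    # gaps (in days) between consecutive timestamps
    a = difftime(x[-1], x[-length(x)], units = 'days')
    # positions where the gap is at least `interval` days
    b = which(a >= interval)
    # run lengths between those break positions
    c = diff(c(0, b, length(x)))
    # repeat each group number for the length of its run
    d = rep(x = seq(length(b) + 1), times = c)
  }
  return(list(d))
}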
The output is:
> res[,.(myFun(datetime)),by=.(ID)]
ID V1
1: 0c980 1,2,2
2: 91320 1
3: f3750 1
4: 1b970 1,2,3,4
The output I want is:
ID datetime V1
1: 0c980 2019-01-10 21:46:43 1
2: 0c980 2019-02-11 16:31:47 2
3: 0c980 2019-02-13 19:38:19 2
4: 91320 2018-12-08 17:25:00 1
5: f3750 2019-02-17 21:11:21 1
6: 1b970 2019-01-12 20:21:48 1
7: 1b970 2019-06-30 14:11:52 2
8: 1b970 2019-07-11 20:04:38 3
9: 1b970 2019-07-26 20:25:17 4
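Can data.table compute by group and return multiple rows per group?
If data.table can't, is there another way to solve my problem, for example with the tidyverse?
Thanks a lot!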
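For reference, here is a minimal sketch of one way this might be done, assuming the rule is "start a new number whenever the gap to the previous record within the same ID is at least 7 days" and that rows are already sorted by datetime within each ID. When the expression in `j` returns a plain vector rather than a list, data.table keeps one value per row, and `:=` adds it as a column by reference. The helper name `gap_group` is hypothetical, and the table is rebuilt here with `tz = "UTC"` only to keep the example self-contained (the time zone does not affect the gaps).

```r
library(data.table)

# Same values as the table in the question, rebuilt so the example stands alone.
res <- data.table(
  ID = c("0c980", "0c980", "0c980", "91320", "f3750",
         "1b970", "1b970", "1b970", "1b970"),
  datetime = as.POSIXct(
    c(1547128003, 1549873907, 1550057899, 1544261100, 1550409081,
      1547295708, 1561875112, 1562846678, 1564143917),
    origin = "1970-01-01", tz = "UTC"
  )
)

# Hypothetical helper: returns a plain integer vector, one value per record.
# The first record always starts group 1; each gap of at least `interval`
# days to the previous record increments the number.
gap_group <- function(x, interval = 7) {
  gaps <- as.numeric(difftime(x[-1L], x[-length(x)], units = "days"))
  cumsum(c(TRUE, gaps >= interval))
}

# Add the numbering as a new column, computed separately for each ID.
res[, V1 := gap_group(datetime), by = ID]
print(res)
```

The same `cumsum` idea should carry over to dplyr by calling the helper inside `mutate()` after `group_by(ID)`.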
[Discussion]:
Tags: r data.table tidyverse