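【Posted】: 2021-04-08 08:20:52
【Question】:
Within each ID group, I want to take the last non-NA value in the DISPENSED_DURATION column and add it to DISPENSED_DATE on that same row, storing the result in a new LAST_DATE column. All other rows of LAST_DATE should stay NA.
I am currently looking at something like copy[, .SD[.N], ID] to get the last row of each group, but I don't know how to skip the NAs and then add the duration to DISPENSED_DATE.
Here is the sample code: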
library(data.table)

dt = data.table(
  ID = c(1,1,1,1,1,2,2,2,2,2),
  DATE = c("2020-01-01","2020-01-02","2020-01-03","2020-01-04","2020-01-05", "2020-01-06","2020-01-07","2020-01-08","2020-01-09","2020-01-10"),
  PRESCRIBED_DATE = c("2020-01-01","2020-01-02","2020-01-03","2020-01-04","2020-01-05", "2020-01-06","2020-01-07","2020-01-08", NA,"2020-01-10"),
  # the last value is a real missing value (NA), not the string "NA"
  DISPENSED_DATE = c("2020-01-01","2020-01-02","2020-01-03","2020-01-04","2020-01-05", "2020-01-06","2020-01-07","2020-01-08", "2020-01-09",NA),
  DISPENSED_DURATION = c(5,5,5,5,5,6,6,6,6,NA)
)
    ID PRESCRIBED_DATE DISPENSED_DATE DISPENSED_DURATION
 1:  1      2020-01-01     2020-01-01                  5
 2:  1      2020-01-02     2020-01-02                  5
 3:  1      2020-01-03     2020-01-03                  5
 4:  1      2020-01-04     2020-01-04                  5
 5:  1      2020-01-05     2020-01-05                  5
 6:  2      2020-01-06     2020-01-06                  6
 7:  2      2020-01-07     2020-01-07                  6
 8:  2      2020-01-08     2020-01-08                  6
 9:  2      2020-01-09     2020-01-09                  6
10:  2      2020-01-10           <NA>                 NA
Expected result:
    ID PRESCRIBED_DATE DISPENSED_DATE DISPENSED_DURATION  LAST_DATE
 1:  1      2020-01-01     2020-01-01                  5       <NA>
 2:  1      2020-01-02     2020-01-02                  5       <NA>
 3:  1      2020-01-03     2020-01-03                  5       <NA>
 4:  1      2020-01-04     2020-01-04                  5       <NA>
 5:  1      2020-01-05     2020-01-05                  5 2020-01-10
 6:  2      2020-01-06     2020-01-06                  6       <NA>
 7:  2      2020-01-07     2020-01-07                  6       <NA>
 8:  2      2020-01-08     2020-01-08                  6       <NA>
 9:  2      2020-01-09     2020-01-09                  6 2020-01-15
10:  2      2020-01-10           <NA>                 NA       <NA>
Thanks!
【Discussion】:
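One way this might be approached, offered only as a sketch and not as a confirmed answer: find the row index of the last non-NA DISPENSED_DURATION in each ID group with .I, then assign LAST_DATE by reference on just those rows. The sketch assumes DISPENSED_DATE is first converted to Date (the sample stores it as character) and that DISPENSED_DURATION is measured in days; the helper name idx is made up for illustration.

library(data.table)

# Reduced copy of the sample data, keeping only the columns needed here.
# Assumption: DISPENSED_DATE is converted to Date so that days can be added.
dt <- data.table(
  ID = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  DISPENSED_DATE = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03",
                             "2020-01-04", "2020-01-05", "2020-01-06",
                             "2020-01-07", "2020-01-08", "2020-01-09", NA)),
  DISPENSED_DURATION = c(5, 5, 5, 5, 5, 6, 6, 6, 6, NA)
)

# .I holds row numbers in the full table; keep the last one per ID whose
# duration is not NA. A group with only NA durations contributes no row.
idx <- dt[, tail(.I[!is.na(DISPENSED_DURATION)], 1), by = ID]$V1

# Assign by reference on those rows only; every other row gets NA.
# Assumption: the duration is a number of days.
dt[idx, LAST_DATE := DISPENSED_DATE + DISPENSED_DURATION]

print(dt)

With the sample data, idx picks rows 5 and 9, so LAST_DATE is filled only there. Using tail(..., 1) instead of .SD[.N] is what skips the trailing NA in group 2.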
Tags: r dplyr data.table