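【Posted】: 2020-05-07 05:33:37
【Question】:
I have a data frame with columns such as ID, category, timestamp, Quantity, and price. I want to group the data by ID and category, take the last 3 values of Quantity and price in each group, and then pivot the table to wide format.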
library(dplyr)
dummy <- data.frame("ID" = c(1,1,2,2,3),
"category"=c("A","A", "B", "A", "C"),
"timestamp"=as.Date(c("2020-04-05", "2020-04-10", "2020-03-01", "2020-01-01", "2020-01-10")),
"Quantity"=c(1,5,6,7,4),
"price"=c(10.2, 45.6, 70.3, 23.4, 10))
> dummy
ID category timestamp Quantity price
1 1 A 2020-04-05 1 10.2
2 1 A 2020-04-10 5 45.6
3 2 B 2020-03-01 6 70.3
4 2 A 2020-01-01 7 23.4
5 3 C 2020-01-10 4 10.0
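I want to select the last 3 rows for each customer (ID) and category. If only one or two rows are present, the missing rows should be filled with 0.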
dummy2 <- data.frame("ID" = c(1,2,2,3),"category" = c("A","B", "A", "C"),
"Quantity1" = c(0,0,0,0),"Quantity2" = c(1,0,0,0),"Quantity3" = c(5,6,7,4),
"price1" = c(0,0,0,0),"price2" = c(10.2,0,0,0),"price3" = c(45.6, 70.3, 23.4, 10.0))
> dummy2
ID category Quantity1 Quantity2 Quantity3 price1 price2 price3
1 1 A 0 1 5 0 10.2 45.6
2 2 B 0 0 6 0 0.0 70.3
3 2 A 0 0 7 0 0.0 23.4
4 3 C 0 0 4 0 0.0 10.0
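Here Quantity1, Quantity2, and Quantity3 hold the values from the (last-2, last-1, last) rows for each ID x category combination.
I tried dummy %>% group_by(ID, category) %>% dplyr::top_n(-3, wt = timestamp) %>% select(Quantity, price), but I don't know what to do after that. Please suggest a solution.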
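One possible way to get from the attempt above to the desired wide table is sketched below. It is a minimal sketch, not the only approach, and it rests on a few assumptions: tidyr is available alongside dplyr, dplyr is at least version 1.0.0 (for slice_tail), and "last" means latest by timestamp. The helper column pos and the variable result are hypothetical names introduced only for this illustration. Note also that top_n() with a negative n keeps the rows with the smallest timestamps, so the sketch sorts by timestamp and uses slice_tail() instead.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
dummy <- data.frame(
  ID        = c(1, 1, 2, 2, 3),
  category  = c("A", "A", "B", "A", "C"),
  timestamp = as.Date(c("2020-04-05", "2020-04-10", "2020-03-01",
                        "2020-01-01", "2020-01-10")),
  Quantity  = c(1, 5, 6, 7, 4),
  price     = c(10.2, 45.6, 70.3, 23.4, 10)
)

result <- dummy %>%
  group_by(ID, category) %>%
  # Sort each group chronologically so the tail holds the latest rows
  arrange(timestamp, .by_group = TRUE) %>%
  slice_tail(n = 3) %>%
  # pos is a hypothetical helper: 3 = last, 2 = last-1, 1 = last-2
  mutate(pos = 3L - n() + row_number()) %>%
  ungroup() %>%
  select(-timestamp) %>%
  # Add the missing positions for short groups and fill them with 0
  complete(nesting(ID, category), pos = 1:3,
           fill = list(Quantity = 0, price = 0)) %>%
  # Spread to Quantity1..Quantity3 and price1..price3
  pivot_wider(names_from = pos,
              values_from = c(Quantity, price),
              names_sep = "")

print(result)
```

The result has one row per ID x category combination, with the columns Quantity1 to Quantity3 and price1 to price3. The row order may differ from dummy2, because complete() sorts by ID and category.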
【Comments】: