【发布时间】:2021-07-24 07:17:59
【问题描述】:
我有以下df:
df<-data.frame(geokey=c("A","A","A","A","A","A","B","B","B","B","B","B"),
upc=c("100","100","101","101","102","102","200","200","201","201",
"202","202"),
endwk=c("14-07-2021","21-07-2021","14-07-2021","21-07-2021","14-07-2021","21-07-2021",
"14-07-2021","21-07-2021","14-07-2021","21-07-2021","14-07-2021","21-07-2021"),
Base_units=c(2,3,1,2,4,1,1,4,2,3,3,2),
Base_price=c(0.1,0.2,0.2,0.1,0.1,0.1,0.2,0.3,0.4,0.1,0.2,0.3),
Incr_units=c(2,1,1,1,2,1,3,2,2,3,1,1),
incr_price=c(0.1,0.1,0.1,0.3,0.2,0.1,0.1,0.2,0.1,0.2,0.1,0.2))
> df
geokey upc endwk Base_units Base_price Incr_units incr_price
1 A 100 14-07-2021 2 0.1 2 0.1
2 A 100 21-07-2021 3 0.2 1 0.1
3 A 101 14-07-2021 1 0.2 1 0.1
4 A 101 21-07-2021 2 0.1 1 0.3
5 A 102 14-07-2021 4 0.1 2 0.2
6 A 102 21-07-2021 1 0.1 1 0.1
7 B 200 14-07-2021 1 0.2 3 0.1
8 B 200 21-07-2021 4 0.3 2 0.2
9 B 201 14-07-2021 2 0.4 2 0.1
10 B 201 21-07-2021 3 0.1 3 0.2
11 B 202 14-07-2021 3 0.2 1 0.1
12 B 202 21-07-2021 2 0.3 1 0.2
预期输出---> 按 geokey--upc---endwk 分组,所有 vol cols 合计(添加)和价格列平均如下所示:
df_merged<-data.frame(geokey=c("A","A","B","B"),
upc=c("upc_100_101_102","upc_100_101_102","upc_200_201_202","upc_200_201_202"),
endwk=c("14-07-2021","21-07-2021","14-07-2021","21-07-2021"),
Base_units_totalled=c(7,6,6,9),
Base_price_averaged=c(0.133,0.133,0.2667,0.2333),
Incr_units_totalled=c(5,3,3,6),
incr_price_averaged=c(0.1333,0.1,0.1,0.2))
> df_merged
geokey upc endwk Base_units_totalled Base_price_averaged Incr_units_totalled incr_price_averaged
1 A upc_100_101_102 14-07-2021 7 0.1330 5 0.1333
2 A upc_100_101_102 21-07-2021 6 0.1330 3 0.1000
3 B upc_200_201_202 14-07-2021 6 0.2667 3 0.1000
4 B upc_200_201_202 21-07-2021 9 0.2333 6 0.2000
我们将不胜感激。
【问题讨论】: