【Posted】: 2017-09-16 11:29:34
【Question】:
I am trying to do in dplyr something I already have working with ddply.
This works:
library(plyr)
library(dplyr)
library(matrixStats)
mtcars2 = tbl_df(mtcars) %>%
mutate(car = rownames(mtcars))
# compute the weighted mean (I use cyl just to provide an example)
ddply(mtcars2, .(car), summarise, FUN = matrixStats::weightedMean(mpg, w = cyl, na.rm = TRUE))
# compute the weighted median
ddply(mtcars2, .(car), summarise, FUN = matrixStats::weightedMedian(mpg, w = cyl, na.rm = TRUE))
The output is:
> ddply(mtcars2, .(car), summarise, FUN = matrixStats::weightedMean(mpg, w = cyl, na.rm = TRUE))
car FUN
1 AMC Javelin 15.2
2 Cadillac Fleetwood 10.4
3 Camaro Z28 13.3
4 Chrysler Imperial 14.7
5 Datsun 710 22.8
6 Dodge Challenger 15.5
7 Duster 360 14.3
8 Ferrari Dino 19.7
9 Fiat 128 32.4
10 Fiat X1-9 27.3
11 Ford Pantera L 15.8
12 Honda Civic 30.4
13 Hornet 4 Drive 21.4
14 Hornet Sportabout 18.7
15 Lincoln Continental 10.4
16 Lotus Europa 30.4
17 Maserati Bora 15.0
18 Mazda RX4 21.0
19 Mazda RX4 Wag 21.0
20 Merc 230 22.8
21 Merc 240D 24.4
22 Merc 280 19.2
23 Merc 280C 17.8
24 Merc 450SE 16.4
25 Merc 450SL 17.3
26 Merc 450SLC 15.2
27 Pontiac Firebird 19.2
28 Porsche 914-2 26.0
29 Toyota Corolla 33.9
30 Toyota Corona 21.5
31 Valiant 18.1
32 Volvo 142E 21.4
And so on; that part is fine.
What I need is something like this (it does not work, because it is not correct):
mtcars3 = tbl_df(mtcars) %>%
mutate(car = rownames(mtcars)) %>%
mutate(weighted_mean_mpg = ddply(mtcars, .(car), summarise, FUN = matrixStats::weightedMean(mpg, w = cyl, na.rm = TRUE))) %>%
mutate(weighted_median_mpg = ddply(mtcars, .(car), summarise, FUN = matrixStats::weightedMedian(mpg, w = cyl, na.rm = TRUE)))
In other words, how do I pass two variables (x and the weight vector w) inside a dplyr statement?
Many thanks in advance!!
【Comments】:
-
Assuming you have loaded dplyr and matrixStats (plus tibble, which provides rownames_to_column), then: mtcars %>% rownames_to_column("car") %>% group_by(car) %>% summarise(wted_mean = weightedMean(mpg, w = cyl, na.rm = TRUE)).
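The one-liner in the comment can be extended to cover both statistics from the question. The following is only a minimal sketch, assuming the goal is to keep all original columns and add the two results as new ones, which is why it uses mutate on the grouped data rather than summarise. The column names weighted_mean_mpg and weighted_median_mpg are taken from the question; cyl is used as the weight only because the question does so.

```r
library(dplyr)
library(tibble)       # rownames_to_column() comes from tibble
library(matrixStats)  # weightedMean(), weightedMedian()

mtcars3 <- mtcars %>%
  rownames_to_column("car") %>%   # turn the row names into a regular column
  group_by(car) %>%               # one group per car, as in the ddply call
  mutate(
    # both x (mpg) and w (cyl) are ordinary columns, evaluated per group
    weighted_mean_mpg   = weightedMean(mpg, w = cyl, na.rm = TRUE),
    weighted_median_mpg = weightedMedian(mpg, w = cyl, na.rm = TRUE)
  ) %>%
  ungroup()                       # drop the grouping so later steps are not affected
```

Because each car appears in exactly one row of mtcars, every group has a single observation and the weighted mean is simply that row's mpg, just as in the ddply output in the question. Grouping by a column with repeated values would be needed to see the weights have any effect.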