【Question title】: Compute weighted mean/median with dplyr as with plyr
【Posted】: 2017-09-16 11:29:34
【Question description】:

I am trying to do in dplyr something that I already have working with ddply.

This works:

library(plyr)
library(dplyr)
library(matrixStats)


mtcars2 = tbl_df(mtcars) %>% 
  mutate(car = rownames(mtcars))

# compute the weighted mean (I use cyl just to provide an example)
ddply(mtcars2, .(car), summarise, FUN = matrixStats::weightedMean(mpg, w = cyl, na.rm = TRUE))

# compute the weighted median
ddply(mtcars2, .(car), summarise, FUN = matrixStats::weightedMedian(mpg, w = cyl, na.rm = TRUE))

The output is

> ddply(mtcars2, .(car), summarise, FUN = matrixStats::weightedMean(mpg, w = cyl, na.rm = TRUE))
                   car  FUN
1          AMC Javelin 15.2
2   Cadillac Fleetwood 10.4
3           Camaro Z28 13.3
4    Chrysler Imperial 14.7
5           Datsun 710 22.8
6     Dodge Challenger 15.5
7           Duster 360 14.3
8         Ferrari Dino 19.7
9             Fiat 128 32.4
10           Fiat X1-9 27.3
11      Ford Pantera L 15.8
12         Honda Civic 30.4
13      Hornet 4 Drive 21.4
14   Hornet Sportabout 18.7
15 Lincoln Continental 10.4
16        Lotus Europa 30.4
17       Maserati Bora 15.0
18           Mazda RX4 21.0
19       Mazda RX4 Wag 21.0
20            Merc 230 22.8
21           Merc 240D 24.4
22            Merc 280 19.2
23           Merc 280C 17.8
24          Merc 450SE 16.4
25          Merc 450SL 17.3
26         Merc 450SLC 15.2
27    Pontiac Firebird 19.2
28       Porsche 914-2 26.0
29      Toyota Corolla 33.9
30       Toyota Corona 21.5
31             Valiant 18.1
32          Volvo 142E 21.4

And so on... that part is fine.

I need something like this (it does not work, because it is incorrect):

mtcars3 = tbl_df(mtcars) %>% 
  mutate(car = rownames(mtcars)) %>% 
  mutate(weighted_mean_mpg = ddply(mtcars, .(car), summarise, FUN = matrixStats::weightedMean(mpg, w = cyl, na.rm = TRUE))) %>% 
  mutate(weighted_median_mpg = ddply(mtcars, .(car), summarise, FUN = matrixStats::weightedMedian(mpg, w = cyl, na.rm = TRUE)))

Or, in other words, I want to pass two variables (x and the weight vector w) inside a dplyr statement.

Many thanks in advance!

【Comments on the question】:

  • Assuming you have dplyr and matrixStats loaded (plus tibble, which provides rownames_to_column), then: mtcars %>% rownames_to_column("car") %>% group_by(car) %>% summarise(wted_mean = weightedMean(mpg, w = cyl, na.rm = TRUE))

Tags: r dplyr plyr


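【Solution 1】:
library(dplyr); library(tibble)
# rownames_to_column() must run before as_tibble(), which drops row names
x <- mtcars %>% rownames_to_column(var = 'car') %>% as_tibble()

# mean() silently ignores wt =; weighted.mean() actually applies the weights
x %>% group_by(car) %>% summarise(m = weighted.mean(mpg, w = cyl)) %>% knitr::kable()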
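
To get both statistics the question asks for, a minimal sketch along the following lines may help. It calls matrixStats::weightedMean and matrixStats::weightedMedian directly inside summarise() or mutate(), passing mpg as x and cyl as w. Note that grouping by car leaves one row per group, so the weights have no effect there and the result is just mpg. Grouping by gear below is an assumption made purely for illustration, and the names cars and by_gear are hypothetical.

```r
library(dplyr)
library(tibble)
library(matrixStats)

# Keep the row names as a column before converting to a tibble
cars <- mtcars %>%
  rownames_to_column(var = "car") %>%
  as_tibble()

# One row per group: both weighted statistics in a single summarise()
# (gear is an illustrative grouping variable, not taken from the question)
by_gear <- cars %>%
  group_by(gear) %>%
  summarise(
    weighted_mean_mpg   = weightedMean(mpg, w = cyl, na.rm = TRUE),
    weighted_median_mpg = weightedMedian(mpg, w = cyl, na.rm = TRUE)
  )

# Keep every car and attach the group-level statistics as new columns
mtcars3 <- cars %>%
  group_by(gear) %>%
  mutate(
    weighted_mean_mpg   = weightedMean(mpg, w = cyl, na.rm = TRUE),
    weighted_median_mpg = weightedMedian(mpg, w = cyl, na.rm = TRUE)
  ) %>%
  ungroup()
```

summarise() collapses each group to a single row, while mutate() keeps all 32 rows and repeats the group value on each, which is closer to the mtcars3 attempt in the question. No ddply call is needed inside the pipeline, because both columns are visible by name within the grouped data frame.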

【Comments】:
