Aggregate function in R with multiple function arguments: answers

【Question title】: Aggregate function in R with multiple function arguments
【Posted】: 2016-05-02 06:53:52
【Question description】:

I have a sample dataset containing climate data for different seasons:

df <- data.frame(season=rep(1:5,2),year=rep(1:2,each=5),
      temp=c(2,4,3,5,2,4,1,5,4,3),ppt=c(4,3,1,5,6,2,1,2,2,2),
      samples=c(22,25,24,31,31,29,28,31,30,32))

I can easily compute the mean of each climate variable for every season of every year:

aggregate(df[,c('temp','ppt')], by = list(df$season,df$year), function(x) mean(x,na.rm=T))

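However, I want to use the variable samples as my weights to compute a weighted mean for each season/year combination.

Basically, I want to replace the mean function inside aggregate() with weighted.mean. That requires adding a second argument to my function, and that argument has to change along with my x:

    function(x, w) weighted.mean(x, w, na.rm = TRUE)

However, I'm not sure how to make the weight argument ('w') of weighted.mean() vary with each subset of the aggregated data.

Can I do all of this inside the aggregate function?

Any suggestions would be great!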
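The difficulty described above comes from aggregate() calling FUN on one column of each subset at a time, so FUN never sees the matching samples values. One possible workaround, sketched here as an assumption rather than a recommended idiom, is to aggregate the row indices and let FUN look up both the values and the weights by index. The helper name wmean_col is hypothetical.

```r
df <- data.frame(season = rep(1:5, 2), year = rep(1:2, each = 5),
                 temp = c(2, 4, 3, 5, 2, 4, 1, 5, 4, 3),
                 ppt = c(4, 3, 1, 5, 6, 2, 1, 2, 2, 2),
                 samples = c(22, 25, 24, 31, 31, 29, 28, 31, 30, 32))

# Grouping variables shared by every call, so the row order of the results matches
by_vars <- list(season = df$season, year = df$year)

# Hypothetical helper: aggregate() splits the row indices into groups;
# FUN then indexes the chosen column and the weights with those indices.
wmean_col <- function(col) {
  aggregate(seq_len(nrow(df)), by = by_vars,
            FUN = function(i) weighted.mean(df[[col]][i], df$samples[i],
                                            na.rm = TRUE))
}

res <- wmean_col("temp")
names(res)[names(res) == "x"] <- "temp"  # aggregate() names the value column "x"
res$ppt <- wmean_col("ppt")$x
res
```

Because the lookup happens inside FUN, the weights always come from the same rows as the values, whatever grouping is used.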

【Question comments】:

Tags: r function arguments aggregate

【Solution 1】:

Try summarise_each from dplyr. It works on data grouped beforehand with group_by and applies the function to multiple columns:

library(dplyr)
df %>% group_by(season, year) %>%
        summarise_each(funs(weighted.mean(., samples,na.rm=T)), temp,ppt)
# Source: local data frame [10 x 5]
# Groups: season, year [10]
# 
#    season  year  temp   ppt samples
#    (int) (int) (dbl) (dbl)   (dbl)
# 1       1     1     2     4      22
# 2       2     1     4     3      25
# 3       3     1     3     1      24
# 4       4     1     5     5      31
# 5       5     1     2     6      31
# 6       1     2     4     2      29
# 7       2     2     1     1      28
# 8       3     2     5     2      31
# 9       4     2     4     2      30
# 10      5     2     3     2      32
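
In the sample data every season/year combination occurs exactly once, so each group holds a single row and its weighted mean equals the raw value. Also, summarise_each() and funs() are deprecated in current dplyr releases. The following is a minimal sketch of the same idea with across(), assuming dplyr 1.0.0 or later. It groups by year alone, purely for illustration, so that the weights actually affect the result; the name by_year is introduced here and is not from the original answer.

```r
library(dplyr)

df <- data.frame(season = rep(1:5, 2), year = rep(1:2, each = 5),
                 temp = c(2, 4, 3, 5, 2, 4, 1, 5, 4, 3),
                 ppt = c(4, 3, 1, 5, 6, 2, 1, 2, 2, 2),
                 samples = c(22, 25, 24, 31, 31, 29, 28, 31, 30, 32))

# Inside across(), .x is the current column and samples is taken from
# the rows of the current group, so the weights vary with each subset.
by_year <- df %>%
  group_by(year) %>%
  summarise(across(c(temp, ppt),
                   ~ weighted.mean(.x, samples, na.rm = TRUE)))
by_year
```

Swapping group_by(year) for group_by(season, year) gives the grouping asked for in the question.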

【Comments】:

Can this be done with aggregate or any other function from base R?
I don't know why you'd want a complicated base solution, but you can. It would take too long to explain in full:
df[,c("temp", "ppt")] <- matrix(ncol=2, unlist(do.call(rbind, lapply(split(df, list(df$season, df$year)),function(df) { lapply(df[,c("temp", "ppt")], function(cols) weighted.mean(cols, df$samples, na.rm=T))}))))
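
The one-liner above overwrites temp and ppt in df in place. A more readable base R version of the same split/lapply idea might look like the sketch below. It builds a separate result (the name res is an assumption, not part of the original comment) and leaves df untouched.

```r
df <- data.frame(season = rep(1:5, 2), year = rep(1:2, each = 5),
                 temp = c(2, 4, 3, 5, 2, 4, 1, 5, 4, 3),
                 ppt = c(4, 3, 1, 5, 6, 2, 1, 2, 2, 2),
                 samples = c(22, 25, 24, 31, 31, 29, 28, 31, 30, 32))

# Split the data frame into one piece per season/year combination
groups <- split(df, list(df$season, df$year), drop = TRUE)

# For each piece, compute the weighted means using that piece's own weights
res <- do.call(rbind, lapply(groups, function(g) {
  data.frame(season = g$season[1],
             year   = g$year[1],
             temp   = weighted.mean(g$temp, g$samples, na.rm = TRUE),
             ppt    = weighted.mean(g$ppt,  g$samples, na.rm = TRUE))
}))
rownames(res) <- NULL
res
```

Each piece g carries its own samples column, which is what lets the weights vary per subset.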