【Question title】: Weighted Mean by Date
【Posted】: 2013-07-13 00:23:51
【Question】:

I have the following data frame:

df = data.frame(date = c("26/06/2013", "26/06/2013",  "26/06/2013",  "27/06/2013", "27/06/2013", "27/06/2013", "28/06/2013", "28/06/2013",   "28/06/2013"), return = c(".51", ".32", ".34", ".39", "1.1", "3.2", "2.1", "5.3", "2.1"), cap = c("500", "235", "392", "213", "134", "144", "232", "155", "213"), weight = c("0.443655723", "0.20851819", "0.347826087", "0.433808554", "0.272912424", "0.293279022", "0.386666667", "0.258333333", "0.355"))

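I want to calculate:

1) The last column, "weight". This is each row's share of the "cap" column within its day.

2) The cap-weighted mean of "return" for each day. I would like to get the following output: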

result = data.frame(date = c("26/06/2013", "27/06/2013", "28/06/2013"), cap.weight.mean = c("0.411251109", "1.407881874", "2.926666667"))
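To make the two quantities concrete, here is the arithmetic for the first date only, as a minimal sketch. The values are copied from `df` above but entered as numbers rather than strings; the variable names `ret`, `cap` and `w` are only for illustration.

```r
# Rows for 26/06/2013, taken from the question's df and typed as numbers.
ret <- c(0.51, 0.32, 0.34)
cap <- c(500, 235, 392)

# 1) Each weight is that row's share of the day's total cap.
w <- cap / sum(cap)

# 2) The cap-weighted mean return. Both expressions compute the same value,
#    because weighted.mean() normalises the weights itself.
sum(w * ret)
weighted.mean(ret, cap)
```

The first weight is 500 / 1127, which matches the 0.443655723 listed in the "weight" column.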

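【Question comments】:

  • Hi, and welcome to SO. Could you elaborate on your question? Specifically, what do you mean by "the last column weight"? weight is not the last column of df. Also, what do you mean by "weighted average return"?

Tags: r mean weighted-average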


【Solution 1】:

Another possibility, using functions from plyr:

library(plyr)
# Change factor to numeric
> df[,-1]<-sapply(df[,-1],function(x){as.numeric(as.character(x))})
> ddply(df,.(date),summarize,cap.weight.mean=weighted.mean(return,weight))
        date cap.weight.mean
1 26/06/2013       0.4112511
2 27/06/2013       1.4078819
3 28/06/2013       2.9266667

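【Comments】:

    【Solution 2】:

    If necessary, first convert the factors to numeric. This applies when the columns are factors, which was the default for string columns in data.frame() before R 4.0.0; the levels() idiom below does not work on plain character columns.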

    df$return=as.numeric(levels(df$return))[df$return]
    df$cap=as.numeric(levels(df$cap))[df$cap]
    df$weight=as.numeric(levels(df$weight))[df$weight]
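A conversion that works whether the columns are factors or character strings is the `as.numeric(as.character(x))` form already used in Solution 1. The sketch below wraps it in a helper; the name `to_numeric` is hypothetical and is not part of either answer.

```r
# Hypothetical helper: convert a factor or character vector of numbers to numeric.
to_numeric <- function(x) as.numeric(as.character(x))

f <- factor(c(".51", "1.1", "3.2"))  # a factor column, as older R would create
s <- c(".51", "1.1", "3.2")          # a character column, as R >= 4.0.0 creates

to_numeric(f)  # the values stored in the labels
to_numeric(s)  # same result for plain strings

# Pitfall: calling as.numeric() directly on a factor returns the internal
# integer level codes, not the numbers shown in the labels.
as.numeric(f)
```

To apply it to the question's data, something like `df[-1] <- lapply(df[-1], to_numeric)` should convert every column except `date`.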
    

    Question 1)

     library(plyr)
     #pretend weight column were absent in df
     ddply(df[,-ncol(df)],"date",function(x) data.frame(x,weight=x$cap/sum(x$cap)))
    

    Question 2)

     library(plyr)
     ddply(df,"date",function(x) data.frame(date=x$date[1],cap.weight.mean=sum(x$cap*x$return)/sum(x$cap)))
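If plyr is not available, the same two steps can be sketched in base R. This is an alternative to the ddply() calls above, not what the answer itself uses; it assumes the columns are already numeric, so the data frame is rebuilt here with numeric literals to keep the example self-contained.

```r
# The question's data, with return and cap entered as numbers.
df <- data.frame(
  date   = rep(c("26/06/2013", "27/06/2013", "28/06/2013"), each = 3),
  return = c(0.51, 0.32, 0.34, 0.39, 1.1, 3.2, 2.1, 5.3, 2.1),
  cap    = c(500, 235, 392, 213, 134, 144, 232, 155, 213)
)

# Question 1: ave() applies FUN within each date and returns a vector
# aligned with the original rows, so it can be assigned as a new column.
df$weight <- ave(df$cap, df$date, FUN = function(x) x / sum(x))

# Question 2: split the rows by date and take the cap-weighted mean of
# return in each piece; sapply() returns a numeric vector named by date.
per_day <- sapply(split(df, df$date),
                  function(x) weighted.mean(x$return, x$cap))

result <- data.frame(date = names(per_day),
                     cap.weight.mean = unname(per_day))
result
```

Using `cap` directly as the weight gives the same mean as using the normalised `weight` column, since weighted.mean() divides by the sum of the weights.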
    

    【Comments】:

      【Solution 3】:

      Here is another option, using by.

      First convert the columns to numeric, as cryo111 mentioned.

      R> by(df, df$date, FUN = function(x) weighted.mean(x$return, w = x$weight) )
      df$date: 26/06/2013
      [1] 0.4112511
      ------------------------------------------------------------ 
      df$date: 27/06/2013
      [1] 1.407882
      ------------------------------------------------------------ 
      df$date: 28/06/2013
      [1] 2.926667
      

      This produces the information in your result data.frame, which I guess is what you are looking for.

      Here is another solution, using memisc:::aggregate.formula:

      > library(memisc)
      > aggregate(weighted.mean(return, weight) ~ date, data = df)
              date weighted.mean(return, weight)
      1 26/06/2013                     0.4112511
      4 27/06/2013                     1.4078819
      7 28/06/2013                     2.9266667
      

      【Comments】:
