【Posted】: 2016-10-21 20:20:28
【Question】:
I have the following data frame; this question is related to [this thread].
df = data.frame(c("2012","2012","2012","2013"),
c("AAA","BBB","AAA","AAA"),
c("X","Not-serviced","X","Y"),
c(2, 10, 3, 2.5))  # numeric, so that mean() can be computed
colnames(df) = c("year","type","service_type","waiting_time")
I want to get the average waiting time separately for the serviced and the not-serviced groups. This is how the data is grouped:
library(data.table)
setDT(df)[, .(num_serviced = sum(service_type != "Not-serviced"),
num_notserviced = sum(service_type == "Not-serviced"),
avg_wt = mean(waiting_time)), ## THE PROBLEM HERE!!!
.(year, type)][, Total := num_serviced + num_notserviced][]
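However, avg_wt = mean(waiting_time) computes the average waiting time over the Total, that is, over all rows of each (year, type) group. What I need instead is avg_wt_serviced and avg_wt_notserviced.
The result must be: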
year type num_serviced num_notserviced num_total avg_wt_serviced avg_wt_notserviced
2012 AAA  2            0               2         2.5             0
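One possible way to get the two averages is to subset waiting_time inside each group before taking the mean. This is only a minimal sketch, and it rests on some assumptions: waiting_time is stored as numeric, the not-serviced label is exactly "Not-serviced", and an empty subset should give 0 (as in the expected row above) rather than NaN. The helper mean_or_zero is a hypothetical name introduced here for illustration, and num_total is computed with .N rather than by adding the two counts.

```r
library(data.table)

# Same sample data as in the question, with waiting_time stored as numeric (assumption)
df <- data.frame(
  year = c("2012", "2012", "2012", "2013"),
  type = c("AAA", "BBB", "AAA", "AAA"),
  service_type = c("X", "Not-serviced", "X", "Y"),
  waiting_time = c(2, 10, 3, 2.5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean of x, or 0 when the subset is empty
mean_or_zero <- function(x) if (length(x) > 0) mean(x) else 0

res <- setDT(df)[, {
  # Logical mask marking the serviced rows within the current (year, type) group
  serviced <- service_type != "Not-serviced"
  .(num_serviced       = sum(serviced),
    num_notserviced    = sum(!serviced),
    num_total          = .N,
    avg_wt_serviced    = mean_or_zero(waiting_time[serviced]),
    avg_wt_notserviced = mean_or_zero(waiting_time[!serviced]))
}, by = .(year, type)]

print(res)
```

For the 2012/AAA group this gives 2 serviced rows, 0 not-serviced rows, an average serviced waiting time of 2.5 and 0 for the not-serviced average. If a missing value is preferable to 0 for an empty group, the helper could return NA_real_ instead.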
【Comments】:
- @RonakShah: You are absolutely right. Thanks for catching that. The 10 belongs to 2012 and BBB. For 2012 and AAA, it is 0.
Tags: r