【Posted】: 2019-12-26 03:58:50
【Question】:
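I have the following dataset (dat), which contains 8 unique treatment groups. I want to sample 3 points from each unique group and store their mean and variance. I want to do this 1000 times in a loop (sampling with replacement), storing all the values in an output object. I tried writing the loop below, but I keep running into unexpected '=' in:"output[i] <- summarise(group_by(new_df[i], fertilizer,crop, level),mean[i]="
Any suggestions on how to fix it, or otherwise improve it, are welcome.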
fertilizer <- c("N","N","N","N","N","N","N","N","N","N","N","N","P","P","P","P","P","P","P","P","P","P","P","P","N","N","N","N","N","N","N","N","N","N","N","N","P","P","P","P","P","P","P","P","P","P","P","P")
crop <- c("alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group")
level <- c("low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","low")
growth <- c(0,0,1,2,90,5,2,5,8,55,1,90,2,4,66,80,1,90,2,33,56,70,99,100,66,80,1,90,2,33,0,0,1,2,90,5,2,2,5,8,55,1,90,2,4,66,0,0)
dat <- data.frame(fertilizer, crop, level, growth)
library(dplyr)
for(i in 1:1000){
  new_df[i] <- dat %>%
    group_by(fertilizer, crop, level) %>%
    sample_n(3)
  output[i] <- summarise(
    group_by(new_df[i], fertilizer, crop, level),
    mean[i] = mean(growth),
    var[i] = sd(growth) * sd(growth))
}
【Comments】:
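- summarize(..., mean[i]=...) is a problem for a couple of reasons: (1) summarize does not accept indexed assignment like this (although it works on simple vectors at the REPL); (2) naming a variable after a (generic) function is arguably bad form, but that is just my two cents. It is mostly the first.
- I fixed a typo in your code. Please check the code you give us before asking.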
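Following up on point (1) above, here is a minimal sketch of one way the loop could be restructured. It is not the asker's final code, just an illustration under a few assumptions: dplyr >= 1.0.0 is installed (for the `.groups` argument), the names `results`, `mean_growth`, `var_growth` and `iteration` are made up for this example, and `dat` is rebuilt compactly with `rep()` to match the vectors in the question. The key change is that `summarise()` gets plain column names on the left of `=`, and each iteration's summary goes into a preallocated list with `[[i]]`.

```r
library(dplyr)

# Rebuild the question's data compactly (assumed equivalent to the posted vectors).
dat <- data.frame(
  fertilizer = rep(rep(c("N", "P"), each = 12), times = 2),
  crop       = rep(c("alone", "group"), times = 24),
  level      = rep(c("low", "low", "high", "high"), times = 12),
  growth     = c(0, 0, 1, 2, 90, 5, 2, 5, 8, 55, 1, 90,
                 2, 4, 66, 80, 1, 90, 2, 33, 56, 70, 99, 100,
                 66, 80, 1, 90, 2, 33, 0, 0, 1, 2, 90, 5,
                 2, 2, 5, 8, 55, 1, 90, 2, 4, 66, 0, 0)
)
dat$level[48] <- "low"  # the posted vector ends in "low", breaking the pattern

n_reps  <- 1000
results <- vector("list", n_reps)  # preallocate one slot per iteration

set.seed(1)  # only so the sketch is reproducible
for (i in seq_len(n_reps)) {
  results[[i]] <- dat %>%
    group_by(fertilizer, crop, level) %>%
    sample_n(3) %>%  # 3 rows per group; pass replace = TRUE to sample with replacement
    summarise(
      mean_growth = mean(growth),  # plain names, no mean[i] = ...
      var_growth  = var(growth),   # same as sd(growth) * sd(growth)
      .groups     = "drop"
    ) %>%
    mutate(iteration = i)  # tag each row with its iteration number
}

# One long data frame: 8 groups x 1000 iterations.
output <- bind_rows(results)
```

Storing each summary in a list and binding once at the end avoids growing a data frame inside the loop. The column names also sidestep point (2), since nothing is named `mean` or `var`.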
Tags: r for-loop dplyr sample resampling