【Question title】:Assigning repeated loop output to one data frame
【Posted】:2021-01-23 04:52:36
【Question description】:

I have a data frame, train. It looks like this:

> str(train)
Classes ‘data.table’ and 'data.frame':  4096 obs. of  2 variables:
 $ XY   : chr  "-0.253056407416539,0.0284760501437887" "-0.248966337417195,0.0327728517259305" "-0.244876267417851,0.0376197657997918" "-0.240786197418507,0.0430736699343487" ...
 $ Group: chr  "fa05,1" "fa05,1" "fa05,1" "fa05,1" ...
 - attr(*, ".internal.selfref")=<externalptr> 

> head(train)
                                      XY  Group
1: -0.253056407416539,0.0284760501437887 fa05,1
2: -0.248966337417195,0.0327728517259305 fa05,1
3: -0.244876267417851,0.0376197657997918 fa05,1
4: -0.240786197418507,0.0430736699343487 fa05,1
5: -0.236696127419163,0.0492435986076443 fa05,1
6: -0.232606057419819,0.0562149950068869 fa05,1

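I wrote code that resamples the XY column by group, splits the values in that column on "," into two separate columns, converts them to numeric, and then takes the mean of the X and Y columns for each group. It runs perfectly, and the output looks like this: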

    Group.1         X         Y
1    fa05,0 0.3174567 1.1083954
2    fa05,1 0.2857464 1.0411072
3    fa10,0 0.2987560 1.1765904
4    fa10,1 0.2563579 1.1286934
5    fa20,0 0.3204026 1.0703147
6    fa20,1 0.2597907 1.1629019
7  flatfa,0 0.3191444 1.0399517
8  flatfa,1 0.2532680 1.1957248
9  flatsa,0 0.3252190 1.0506540
10 flatsa,1 0.3124151 0.8458343
11   sa05,0 0.2792419 1.1065144
12   sa05,1 0.2186174 1.2720533
13   sa10,0 0.3071584 1.3031327
14   sa10,1 0.3134321 1.0493272
15   sa20,0 0.3134320 1.1239246
16   sa20,1 0.2919554 1.2797494

Now I am trying to put this in a loop so that it repeats 10 times and assigns the results to the same data frame. I came up with this:

boot_means <- data.frame(Group.1 = rep(c(""), each=16*10),
                         X = rep(c(as.numeric("")), each=16*10),
                         Y = rep(c(as.numeric("")), each=16*10))

for (i in 1:10){
  train_resample <- setDT(train)[, .(XY=sample(XY, replace=T)), by = Group]
  train_sep <- train_resample %>% separate(XY, c("X", "Y"), ",") 
  train_sep$X <- as.numeric(train_sep$X)
  train_sep$Y <- as.numeric(train_sep$Y)
  resample_means <- aggregate(train_sep[, 2:3], list(train_sep$Group), mean)
  print(resample_means)
  boot_means[i] <- resample_means
}

It works for print(resample_means): there I get the expected output. But when I look at boot_means, the loop has assigned the Group variable to all columns.

> head(boot_means)
  Group.1      X      Y
1  fa05,0 fa05,0 fa05,0
2  fa05,1 fa05,1 fa05,1
3  fa10,0 fa10,0 fa10,0
4  fa10,1 fa10,1 fa10,1
5  fa20,0 fa20,0 fa20,0
6  fa20,1 fa20,1 fa20,1

This is not what I want! Can you help me?

【Comments】:

    Tags: r for-loop mean resampling


    【Solution 1】:

    Make boot_means a list and store each data frame in it.
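
    For context on why the original loop misbehaves: with a single index, boot_means[i] addresses column i of the data frame, so R keeps only the first column of the assigned data frame (the group labels) and recycles it down the rows. The snippet below is a small sketch of that behaviour; target, result and store are made-up toy objects, not the asker's data.

```r
# Toy stand-ins (hypothetical data): a 4-row target and a 2-row result
target <- data.frame(Group.1 = rep("", 4),
                     X = rep(NA_real_, 4),
                     Y = rep(NA_real_, 4))
result <- data.frame(Group.1 = c("a", "b"), X = c(1, 2), Y = c(3, 4))

# Single-bracket indexing with one index addresses COLUMN 2 of `target`.
# R warns that 3 variables were provided to replace 1, keeps only the
# first one (the labels) and recycles it to fill the 4 rows.
target[2] <- result   # a warning is expected here
print(target$X)       # X now holds the recycled labels, not numbers

# A list slot, by contrast, stores the whole data frame unchanged
store <- vector("list", 3)
store[[2]] <- result
print(identical(store[[2]], result))
```

    This is why the fix uses double brackets on a list: each slot can hold a complete data frame, whatever its shape.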

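    library(data.table)
    library(tidyr)  # provides separate() and re-exports %>%
    
    # Pre-allocate a list with one slot per resample
    boot_means <- vector('list', 10)
    
    for (i in 1:10){
      # Resample XY with replacement within each group
      train_resample <- setDT(train)[, .(XY = sample(XY, replace = TRUE)), by = Group]
      # Split the "x,y" strings into two columns and convert them to numeric
      train_sep <- train_resample %>% tidyr::separate(XY, c("X", "Y"), ",") 
      train_sep$X <- as.numeric(train_sep$X)
      train_sep$Y <- as.numeric(train_sep$Y)
      # Per-group means of X and Y for this resample
      resample_means <- aggregate(train_sep[, 2:3], list(train_sep$Group), mean)
      # Double brackets store the whole data frame in slot i
      boot_means[[i]] <- resample_means
    }
    # If you want everything in one data frame:
    combined_data <- rbindlist(boot_means)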
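
    If it is useful to know which resample each row came from, rbindlist() accepts an idcol argument. The following is a minimal, self-contained sketch on made-up data: toy, n_rep and the column name replicate are assumptions for illustration, and it swaps tidyr::separate and aggregate for data.table's tstrsplit and a grouped mean, so it is a variant of the answer rather than the same code.

```r
library(data.table)

set.seed(1)  # only to make this toy example repeatable

# Hypothetical toy data in the same shape as `train`:
# "x,y" strings plus a group label
toy <- data.table(
  XY    = paste(round(runif(8), 3), round(runif(8), 3), sep = ","),
  Group = rep(c("g1", "g2"), each = 4)
)

n_rep <- 3
boot_means <- vector("list", n_rep)

for (i in seq_len(n_rep)) {
  # Resample within each group, with replacement
  resampled <- toy[, .(XY = sample(XY, replace = TRUE)), by = Group]
  # Split on "," and let type.convert turn the pieces into numbers
  resampled[, c("X", "Y") := tstrsplit(XY, ",", fixed = TRUE, type.convert = TRUE)]
  # Per-group means for this resample
  boot_means[[i]] <- resampled[, .(X = mean(X), Y = mean(Y)), by = Group]
}

# idcol adds a column recording which list element each row came from
combined <- rbindlist(boot_means, idcol = "replicate")
print(combined)
```

    With the replicate column in place, the combined table can be summarised across resamples, for example by taking the spread of X and Y per group.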
    

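    【Comments】:

    • Works perfectly. Thank you so much!
    • @Skårup Glad I could help! Feel free to accept the answer by clicking the check mark on the left :-) Only one answer can be accepted per post.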