for loop with lists in R: answers

【Question title】: for loop with lists R
【Posted】: 2018-12-27 18:51:01
【Question description】:

I want to create two lists in a for loop, one per data frame, but I cannot get assign to do it:

dat <- data.frame(name = c(rep("a", 10), rep("b", 13)),
                  x = c(1,3,4,4,5,3,7,6,5,7,8,6,4,3,9,1,2,3,5,4,6,3,1),
                  y = c(1.1,3.2,4.3,4.1,5.5,3.7,7.2,6.2,5.9,7.3,8.6,6.3,4.2,3.6,9.7,1.1,2.3,3.2,5.7,4.8,6.5,3.3,1.2))

a <- dat[dat$name == "a",]
b <- dat[dat$name == "b",]

samp <- vector(mode = "list", length = 100)
h <- list(a,b)
hname <- c("a", "b")

for (j in 1:length(h)) {
  for (i in 1:100) {
    samp[[i]] <- sample(1:nrow(h[[j]]), nrow(h[[j]])*0.5)
    assign(paste("samp", hname[j], sep="_"), samp[[i]])
  }
}

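Instead of lists named samp_a and samp_b, I get vectors that contain only the result of the 100th sample. I want two lists, samp_a and samp_b, that hold all the different samples of dat[dat$name == "a",] and dat[dat$name == "b",] respectively.

How can I do this?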

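【Question discussion】:

Why do you want to do this? Keeping your workspace structured is almost always (perhaps always) better; creating 100 numbered objects makes no sense.
Because I need 100 randomly drawn samples to test some functions on them. The loop is inside a function, so my workspace will stay structured, I guess...
You already have them in your samp list, so why do you need to assign them to separate objects in the workspace? Whether you want to feed them to different models, save them to different files, or iterate over them with lapply or a for loop, all of that can be done while keeping them in one list.
To me, the code is harder to follow when I put the two lists inside one list.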
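
The commenters' suggestion of keeping everything in one list could look roughly like the following. This is a minimal sketch, not code from the thread: `dat`, `h`, `hname`, and the sampling call come from the question, while the nested list keyed by group name and the `set.seed` call (there only to make the run reproducible) are assumptions added for illustration. `nrow(...) %/% 2` is used so the sample size is an explicit integer.

```r
# Data from the question
dat <- data.frame(name = c(rep("a", 10), rep("b", 13)),
                  x = c(1,3,4,4,5,3,7,6,5,7,8,6,4,3,9,1,2,3,5,4,6,3,1),
                  y = c(1.1,3.2,4.3,4.1,5.5,3.7,7.2,6.2,5.9,7.3,8.6,6.3,4.2,3.6,9.7,1.1,2.3,3.2,5.7,4.8,6.5,3.3,1.2))
h <- list(dat[dat$name == "a", ], dat[dat$name == "b", ])
hname <- c("a", "b")

set.seed(1)  # assumption: only for reproducibility

# One outer list with an element per group; each element holds 100 index samples
samp <- setNames(vector("list", length(h)), hname)
for (j in seq_along(h)) {
  samp[[j]] <- vector("list", 100)
  for (i in 1:100) {
    # draw half of the row indices of group j, without replacement
    samp[[j]][[i]] <- sample(nrow(h[[j]]), nrow(h[[j]]) %/% 2)
  }
}

length(samp$a)       # number of samples stored for group "a"
length(samp$b[[1]])  # size of a single sample for group "b"
```

With this layout, `samp$a` and `samp$b` play the role of `samp_a` and `samp_b`, and no `assign` call is needed.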

Tags: r list for-loop assign

【Solution 1】:

How to create two separate lists while avoiding assign:

Option 1:

# create empty lists
samp_a <- list()
samp_b <- list()

for (j in seq_along(h)) {

    # fill samp_a list
    if (j == 1) {
        for (i in 1:100) {
            samp_a[[i]] <- sample(1:nrow(h[[j]]), nrow(h[[j]])*0.5)
        }
    # fill samp_b list
    } else if (j == 2) {
        for (i in 1:100) {
            samp_b[[i]] <- sample(1:nrow(h[[j]]), nrow(h[[j]])*0.5)
        }
    }
}

You can also use assign, which makes the answer shorter:

Option 2:

for (j in seq(hname)) {
    l = list()
    for (i in 1:100) {
        l[[i]] <- sample(1:nrow(h[[j]]), nrow(h[[j]])*0.5)
    }
    assign(paste0('samp_', hname[j]), l)
    rm(l)
}

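【Discussion】:

The OP's for loop may not do what they want, but correct me if I'm wrong. All this loop does is draw 100 samples of 5 (or 6) numbers from 1:10 (or 1:13). The OP wants paired x,y samples of a and b. You would need to change the sample call to samp_a[[i]] <- h[[j]][sample(1:nrow(h[[j]]), nrow(h[[j]])*0.5), ]. Unless the OP wants a random x paired with a random y.
@Anonymouscoward You may be right; let the OP confirm.
@YOLO I think the code is correct. I need 100 samples per name, assigned to the lists samp_a and samp_b. It actually stores the row indices, so the samples are drawn once I apply the indices to my data frame. Or am I missing something?
No, you are right. Since you want the indices, that is correct.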
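
The point settled in this exchange, that the lists hold row indices and the sampled rows come from applying them to the data frame, can be sketched as follows. This is an illustrative sketch under assumptions: `rows_a` is a hypothetical name that does not appear in the thread, only group `a` is shown, and `set.seed` is there just for reproducibility.

```r
# Data from the question, restricted to group "a"
dat <- data.frame(name = c(rep("a", 10), rep("b", 13)),
                  x = c(1,3,4,4,5,3,7,6,5,7,8,6,4,3,9,1,2,3,5,4,6,3,1),
                  y = c(1.1,3.2,4.3,4.1,5.5,3.7,7.2,6.2,5.9,7.3,8.6,6.3,4.2,3.6,9.7,1.1,2.3,3.2,5.7,4.8,6.5,3.3,1.2))
a <- dat[dat$name == "a", ]

set.seed(42)  # assumption: only for reproducibility

# Index samples, built the same way as in Option 1
samp_a <- vector("list", 100)
for (i in 1:100) {
  samp_a[[i]] <- sample(nrow(a), nrow(a) %/% 2)
}

# Apply each index vector to the data frame; whole rows are selected,
# so every x stays paired with its own y
rows_a <- lapply(samp_a, function(idx) a[idx, ])

rows_a[[1]]  # the rows belonging to the first sample
```

Storing indices keeps the list small, and the rows can be materialised only when a function actually needs them.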

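【Solution 2】:

You can easily do this with lapply combined with the rep function. Sampling whole rows keeps the existing x,y pairing, which is what you want unless you are after a random x paired with a random y.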

dat <- data.frame(name = c(rep("a", 10), rep("b", 13)),
              x = c(1,3,4,4,5,3,7,6,5,7,8,6,4,3,9,1,2,3,5,4,6,3,1),
              y = c(1.1,3.2,4.3,4.1,5.5,3.7,7.2,6.2,5.9,7.3,8.6,6.3,4.2,3.6,9.7,1.1,2.3,3.2,5.7,4.8,6.5,3.3,1.2))

a <- dat[dat$name == "a",]
b <- dat[dat$name == "b",]

h <- list(a,b)
hname <- c("a", "b")

testfunc <- function(df) {
  # df[sample(nrow(df), nrow(df)*0.5), ]  # use this line instead to return the sampled rows (the values)
  sample(nrow(df), nrow(df)*0.5)          # returns only the row indices
}

lapply(h, testfunc) # Plain lapply call: gives just one sample for a and one for b
samp <- lapply(rep(h, 100), testfunc) # rep(h, 100) repeats the list 100 times (a, b, a, b, ...), so this gives 100 samples for a and 100 for b in one list

samp_a <- samp[c(TRUE, FALSE)] # The recycled TRUE/FALSE vector selects the odd elements, which here are the samples drawn from `a`.
samp_b <- samp[c(FALSE, TRUE)] # And here the even elements, which are the samples drawn from `b`.
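
A variation on the same idea that avoids the interleaved list and the recycled logical index might look like this. It is a sketch under assumptions rather than part of the answer: `draw_samples` is a hypothetical helper name, base R's `split()` and `replicate(..., simplify = FALSE)` replace the `rep`/`lapply` combination, and `set.seed` is only there for reproducibility.

```r
# Data from the question
dat <- data.frame(name = c(rep("a", 10), rep("b", 13)),
                  x = c(1,3,4,4,5,3,7,6,5,7,8,6,4,3,9,1,2,3,5,4,6,3,1),
                  y = c(1.1,3.2,4.3,4.1,5.5,3.7,7.2,6.2,5.9,7.3,8.6,6.3,4.2,3.6,9.7,1.1,2.3,3.2,5.7,4.8,6.5,3.3,1.2))

set.seed(123)  # assumption: only for reproducibility

# Hypothetical helper: n index samples, each covering half the rows of df
draw_samples <- function(df, n = 100) {
  # simplify = FALSE keeps the result as a list instead of a matrix
  replicate(n, sample(nrow(df), nrow(df) %/% 2), simplify = FALSE)
}

# split() returns a list of data frames named after the groups ("a", "b")
samp <- lapply(split(dat, dat$name), draw_samples)

samp_a <- samp$a
samp_b <- samp$b
```

Because `split()` names the result by group, the samples for each group can be picked out by name rather than by position.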

【Discussion】: