R - How to generate a list of data frames with character() elements

【Question title】: R - How to generate a List of dataframes with character() elements
【Posted】: 2014-01-02 02:18:33
【Question】:

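How can I build a list of text-section data frames (for a given text) in which every sentence of a section gets its own ID? (Unlike the simple example below, the number of sentences is not always the same.)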

sent <- character()
id <- character()
section <- data.frame(SentIDs=character(), Sentences=character(), stringsAsFactors=FALSE)
textList <- list(section)

sections <- 3
sentences <- 5

for(i in 1:sections){
   for (j in 1:sentences){
      textList[[i]][j,1] <- paste(i, j, sep=",")      # ID of one sentence is put into the section dataframe inside the textList
      textList[[i]][j,2] <- paste("sent", j, sep=" ") # sentence is put into the section dataframe inside the textList
   }
}
textList

This returns an error and the wrong output:

Error in `*tmp*`[[i]] : subscript out of bounds

> textList
[[1]]
  SentIDs Sentences
1     1,1    sent 1
2     1,2    sent 2
3     1,3    sent 3
4     1,4    sent 4
5     1,5    sent 5

Desired output:

> textList    
[[1]]
  SentIDs Sentences
1     1,1    sent 1
2     1,2    sent 2
3     1,3    sent 3
4     1,4    sent 4
5     1,5    sent 5

[[2]]
  SentIDs Sentences
1     2,1    sent 1
2     2,2    sent 2
3     2,3    sent 3
4     2,4    sent 4
5     2,5    sent 5

[[3]]
  SentIDs Sentences
1     3,1    sent 1
2     3,2    sent 2
3     3,3    sent 3
4     3,4    sent 4
5     3,5    sent 5

Thanks! :)

【Comments】:

Tags: r list text dataframe

【Solution 1】:

This is a good job for replicate. There is no need for a for loop. Setting simplify=FALSE makes replicate return a list.

set.seed(1)
replicate(3, {
  n  <- sample(1:4, 1)   ## random number of rows
  ID <- seq_len(n)
  data.frame(ID = ID, sent = paste("sent", ID))
}, simplify = FALSE)

[[1]]
  ID   sent
1  1 sent 1
2  2 sent 2

[[2]]
  ID   sent
1  1 sent 1
2  2 sent 2

[[3]]
  ID   sent
1  1 sent 1
2  2 sent 2
3  3 sent 3
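
To make the effect of simplify=FALSE concrete, here is a small sketch with a made-up toy expression (not part of the original answer). By default replicate tries to simplify its results the way sapply does, so equal-length vectors are bound into a matrix; with simplify=FALSE each repetition stays a separate list element.

```r
# Each repetition yields a length-2 vector, so the default
# (simplify = "array") binds the results into a 2 x 3 matrix.
m <- replicate(3, c(1, 2))
is.matrix(m)   # TRUE
dim(m)         # 2 3

# With simplify = FALSE the three results are kept as list elements.
l <- replicate(3, c(1, 2), simplify = FALSE)
is.list(l)     # TRUE
length(l)      # 3
```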

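Edit, after the OP's clarification:

You should use lapply here, since you already have a list. You can also use seq_along and seq_len to build index sequences from a vector or from a given length.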

 ## ll: the existing list of character vectors (one vector of sentences per section)
 lapply(seq_along(ll),function(i)
   data.frame(sent=ll[[i]],
              Id=paste(i,seq_along(ll[[i]]),sep=",")))
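
The snippet above assumes that ll already exists. A self-contained sketch with a hypothetical ll (the sentences are invented for illustration, and stringsAsFactors=FALSE is added so the columns stay character in older R versions) might look like this:

```r
# Hypothetical input: one character vector of sentences per section.
ll <- list(c("First sentence.", "Second sentence."),
           c("Only sentence."),
           c("A.", "B.", "C."))

# One data frame per section; IDs have the form "<section>,<sentence>".
res <- lapply(seq_along(ll), function(i)
  data.frame(Id   = paste(i, seq_along(ll[[i]]), sep = ","),
             sent = ll[[i]],
             stringsAsFactors = FALSE))

res[[3]]$Id        # "3,1" "3,2" "3,3"
sapply(res, nrow)  # 2 1 3
```

Because seq_along(ll[[i]]) follows the length of each vector, sections with different numbers of sentences need no special handling.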

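【Comments】:

+1! Thank you very much! My question was really about how to create the list, my bad. In my original context I already have the sentences and their positions are known, but I could not generate the correct list object.
@alex Do you have a list of sentences that you want to split randomly into several lists?
In the original code I have a list (its elements represent sections) of vectors (their elements represent sentences) without IDs. The challenge is to create a similar list, but with an ID added to each sentence. My question may have been a bit odd. I am relatively new to this field. :) Thanks!

【Solution 2】:

Using a function from the apply family instead of a for loop gives you more concise and efficient code: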

sections <- 3
sentences <- 5
textList <- lapply(1:sections, function(x) {
  data.frame(SentIDs=paste0(x, ",", 1:sentences),
             Sentences=paste("sent", 1:sentences))
})
textList

# [[1]]
#   SentIDs Sentences
# 1     1,1    sent 1
# 2     1,2    sent 2
# 3     1,3    sent 3
# 4     1,4    sent 4
# 5     1,5    sent 5
# 
# [[2]]
#   SentIDs Sentences
# 1     2,1    sent 1
# 2     2,2    sent 2
# 3     2,3    sent 3
# 4     2,4    sent 4
# 5     2,5    sent 5
# 
# [[3]]
#   SentIDs Sentences
# 1     3,1    sent 1
# 2     3,2    sent 2
# 3     3,3    sent 3
# 4     3,4    sent 4
# 5     3,5    sent 5
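
The question mentions that the number of sentences can differ between sections. One way to adapt the lapply approach, sketched here under the assumption that the per-section counts are available in a vector (n_sent and its values are made up for illustration):

```r
# Assumed input: number of sentences in each section (made-up values).
n_sent <- c(2, 4, 1)

textList <- lapply(seq_along(n_sent), function(s) {
  j <- seq_len(n_sent[s])                 # sentence indices within section s
  data.frame(SentIDs   = paste(s, j, sep = ","),
             Sentences = paste("sent", j),
             stringsAsFactors = FALSE)
})

sapply(textList, nrow)   # 2 4 1
textList[[2]]$SentIDs    # "2,1" "2,2" "2,3" "2,4"
```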

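【Comments】:

+1, thank you very much. This solution solved my problem perfectly :)

【Solution 3】:

You need to create each section's data frame inside the outer loop, before filling it in the inner loop: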

sent <- character()
id <- character()
textList <- list()


sections <- 3
sentences <- 5

for(i in 1:sections){
   textList[[i]] <- data.frame(SentIDs=character(), Sentences=character(), stringsAsFactors=FALSE)

   for (j in 1:sentences){
      textList[[i]][j,1] <- paste(i, j, sep=",")      # ID of one sentence is put into the section dataframe inside the textList
      textList[[i]][j,2] <- paste("sent", j, sep=" ") # sentence is put into the section dataframe inside the textList
   }
}
textList
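
The original error arises because list(section) creates a list of length one, so textList[[2]] does not exist when the inner loop tries to index into it. As a sketch of an alternative arrangement (not the code from this answer), the list can be pre-allocated with vector("list", sections) and each data frame built in a single call rather than grown row by row:

```r
sections  <- 3
sentences <- 5

# The question's setup: a list holding a single empty data frame.
section <- data.frame(SentIDs = character(), Sentences = character(),
                      stringsAsFactors = FALSE)
length(list(section))     # 1, so elements [[2]] and [[3]] do not exist yet

# Pre-allocate one slot per section, then fill each slot in the loop.
textList <- vector("list", sections)
for (i in seq_len(sections)) {
  j <- seq_len(sentences)
  textList[[i]] <- data.frame(SentIDs   = paste(i, j, sep = ","),
                              Sentences = paste("sent", j),
                              stringsAsFactors = FALSE)
}

length(textList)          # 3
textList[[3]]$SentIDs[5]  # "3,5"
```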

【Comments】: