Understanding how to read nested lists in R in order to access the data: answers

【Question title】: Understanding how to read nested lists in R in order to access the data
【Posted】: 2015-10-11 15:10:52
【Question】:

Here is the setup for my list of lists. For the read.GenBank function, see the development code here.

#must be connected to internet for this to work
library(ape)
library(plyr)
gi_sample<-c(336087836, 336087835, 336087834)
#use read.GenBank with lapply because we ultimately have more than the max at one time (400)
#gi_sample is a plain vector with no dim attribute, so apply(gi_sample, 1, ...) would fail
my_output <- lapply(gi_sample, function(x) read.GenBank(x))

str(my_output)
List of 3
 $ :List of 1
  ..$ 336087836:Class 'DNAbin'  raw [1:606] 00 00 00 00 ...
  ..- attr(*, "class")= chr "DNAbin"
  ..- attr(*, "species")= chr "Flavobacterium_johnsoniae"
  ..- attr(*, "references")=List of 1
  .. ..$ 336087836:List of 2
  .. .. ..$ :List of 4
  .. .. .. ..$ pubmedid: chr(0) 
  .. .. .. ..$ authors : chr "Rusznyak,A., Akob,D.M., Nietzsche,S., Eusterhues,K., Totsche,K.U., Neu,T.R., Frosch,T., Popp,J., Keiner,R., Geletneky,J., Katzs"| __truncated__
  .. .. .. ..$ title   : chr "Calcite mineralization by karstic cave bacteria"
  .. .. .. ..$ journal : chr "Unpublished"
  .. .. ..$ :List of 4
  .. .. .. ..$ pubmedid: chr(0) 
  .. .. .. ..$ authors : chr "Rusznyak,A."
  .. .. .. ..$ title   : chr "Direct Submission"
  .. .. .. ..$ journal : chr "Submitted (13-APR-2011) to the INSDC. Institute of Ecology, Aquatic"
 $ :List of 1
  ..$ 336087835:Class 'DNAbin'  raw [1:991] 00 00 00 00 ...
  ..- attr(*, "class")= chr "DNAbin"
  ..- attr(*, "species")= chr "Rhodococcus_fascians"
  ..- attr(*, "references")=List of 1
  .. ..$ 336087835:List of 2
  .. .. ..$ :List of 4
  .. .. .. ..$ pubmedid: chr(0) 
  .. .. .. ..$ authors : chr "Rusznyak,A., Akob,D.M., Nietzsche,S., Eusterhues,K., Totsche,K.U., Neu,T.R., Frosch,T., Popp,J., Keiner,R., Geletneky,J., Katzs"| __truncated__
  .. .. .. ..$ title   : chr "Calcite mineralization by karstic cave bacteria"
  .. .. .. ..$ journal : chr "Unpublished"
  .. .. ..$ :List of 4
  .. .. .. ..$ pubmedid: chr(0) 
  .. .. .. ..$ authors : chr "Rusznyak,A."
  .. .. .. ..$ title   : chr "Direct Submission"
  .. .. .. ..$ journal : chr "Submitted (13-APR-2011) to the INSDC. Institute of Ecology, Aquatic"
 $ :List of 1
  ..$ 336087834:Class 'DNAbin'  raw [1:690] 00 00 00 00 ...
  ..- attr(*, "class")= chr "DNAbin"
  ..- attr(*, "species")= chr "Serratia_plymuthica"
  ..- attr(*, "references")=List of 1
  .. ..$ 336087834:List of 2
  .. .. ..$ :List of 4
  .. .. .. ..$ pubmedid: chr(0) 
  .. .. .. ..$ authors : chr "Rusznyak,A., Akob,D.M., Nietzsche,S., Eusterhues,K., Totsche,K.U., Neu,T.R., Frosch,T., Popp,J., Keiner,R., Geletneky,J., Katzs"| __truncated__
  .. .. .. ..$ title   : chr "Calcite mineralization by karstic cave bacteria"
  .. .. .. ..$ journal : chr "Unpublished"
  .. .. ..$ :List of 4
  .. .. .. ..$ pubmedid: chr(0) 
  .. .. .. ..$ authors : chr "Rusznyak,A."
  .. .. .. ..$ title   : chr "Direct Submission"
  .. .. .. ..$ journal : chr "Submitted (13-APR-2011) to the INSDC. Institute of Ecology, Aquatic"

What I want to get from this list:

GI  authors     title       journal 
336087836   "Rusznyak,A., Akob,D.M., Nietzsche,S., Eusterhues,K., Totsche,K.U., Neu,T.R., Frosch,T., Popp,J., Keiner,R., Geletneky,J., Katzs"| __truncated__    "Calcite mineralization by karstic cave bacteria"   "Unpublished"
336087835   "Rusznyak,A., Akob,D.M., Nietzsche,S., Eusterhues,K., Totsche,K.U., Neu,T.R., Frosch,T., Popp,J., Keiner,R., Geletneky,J., Katzs"| __truncated__    "Calcite mineralization by karstic cave bacteria"   "Unpublished"
336087834   "Rusznyak,A., Akob,D.M., Nietzsche,S., Eusterhues,K., Totsche,K.U., Neu,T.R., Frosch,T., Popp,J., Keiner,R., Geletneky,J., Katzs"| __truncated__    "Calcite mineralization by karstic cave bacteria"   "Unpublished"

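I badly need an explanation of how these lists are nested. How do I access "title" and keep the name of each list? I have tinkered with various "[]" subsetting combinations and still do not understand how to read this list. I have read many beginner explanations, but I am still at a loss.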

This is a different question from my previous one, although the data are the same.

Thanks!
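
As a rough illustration of the nesting shown in the str() output, the sketch below builds a small stand-in for my_output and walks down to one title. Everything here is an assumption made for illustration: make_record is a hypothetical helper, the author strings are shortened, and the "DNAbin" class is left out so that ape is not needed. The point it tries to show is that the GI is the name of the single element in each record, while the references hang off that record as an attribute, so they are reached with attr() rather than with `$` or `[[`.

```r
# Hypothetical stand-in for one element of my_output; no network access needed.
# The real objects also carry class "DNAbin", omitted here so ape is not required.
make_record <- function(gi, species) {
  refs <- list(
    list(pubmedid = character(0),
         authors  = "Rusznyak,A., Akob,D.M.",
         title    = "Calcite mineralization by karstic cave bacteria",
         journal  = "Unpublished"),
    list(pubmedid = character(0),
         authors  = "Rusznyak,A.",
         title    = "Direct Submission",
         journal  = "Submitted (13-APR-2011) to the INSDC.")
  )
  rec <- setNames(list(as.raw(c(0, 0, 0, 0))), gi)   # placeholder sequence bytes
  attr(rec, "species") <- species
  attr(rec, "references") <- setNames(list(refs), gi)
  rec
}

my_output <- list(
  make_record("336087836", "Flavobacterium_johnsoniae"),
  make_record("336087835", "Rhodococcus_fascians"),
  make_record("336087834", "Serratia_plymuthica")
)

rec <- my_output[[1]]            # level 1: the first record, a list of length 1
gi <- names(rec)                 # the GI is the name of that single element
refs <- attr(rec, "references")  # metadata is an attribute, not a list element
first_ref <- refs[[gi]][[1]]     # pick by GI name, then the first of two references
first_ref$title

# The same walk for every record, keeping the GI as the name of each title
titles <- vapply(my_output, function(rec) {
  attr(rec, "references")[[1]][[1]]$title
}, character(1))
names(titles) <- vapply(my_output, names, character(1))
titles
```

Reading the str() output this way, each `$` line is a list element, and each `- attr(*, ...)` line is an attribute of the object one level up.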

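【Comments】:

dput is much more useful than str.
Something like as.data.frame(do.call(c, seqs)), or some variation involving unlist(), might work for you. rapply() may be necessary, and dput() is essential.
You can always give just a smaller sample that is representative of the full data. We don't need the full list.
Take a look at ?list.flatten from the rlist library.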
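
To make the unlist() suggestion above more concrete, here is a minimal sketch on a made-up list (the values are placeholders, not the real GenBank records). It only illustrates three behaviours of base R's unlist(): names from each level are joined with a dot, mixed types are coerced to a common type, and zero-length elements such as the empty pubmedid simply disappear.

```r
# Made-up nested list, keyed by GI, with named fields at the inner level
nested <- list("336087836" = list(authors = "Rusznyak,A.",
                                  title   = "Direct Submission"))

flat <- unlist(nested)
names(flat)    # names of both levels, joined with "."

# Mixed types are coerced to one common type (here character)
mixed <- unlist(list(n = 1, s = "a"))

# Zero-length elements contribute nothing to the result
dropped <- unlist(list(pubmedid = character(0), title = "Direct Submission"))
```

Because the result is a flat named vector, the grouping by reference is lost, so this tends to suit quick inspection more than building a table.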

Tags: r list

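【Solution 1】:

It sounds like the data are hard to dput. I put together an example that may be useful (posted as an answer only because it is too long for a comment).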

a <- list(list(1:5), list(5:10))
b <- list(list(letters[1:5]), list(LETTERS[5:10]))
data.frame(unlist(a), unlist(b))
   unlist.a. unlist.b.
1          1         a
2          2         b
3          3         c
4          4         d
5          5         e
6          5         E
7          6         F
8          7         G
9          8         H
10         9         I
11        10         J

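【Discussion】:

【Solution 2】:

This answer, which uses bind_rows from dplyr, should work, but I could not test it on your data: even with the development version of ape, the code you provided does not give me the references attribute.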

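library("dplyr")
# refs is not defined in this post; presumably it is the named list of references, keyed by GI
bind_rows(lapply(seq_along(refs), function(i) {
  data.frame(GI = names(refs)[i],
             as.data.frame(refs[[i]][lengths(refs[[i]]) > 0]))
}))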

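【Discussion】:

Thanks, but this still tells me "arguments imply differing number of rows: 0, 1". At the top of the question I linked to GitHub for the development version of the function, which provides the references.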
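
One plausible reading of that error, judging only from the str() output in the question, is that pubmedid is character(0) inside each reference, so converting a whole reference to a data frame mixes a zero-row column with one-row columns. The sketch below is a base R approach under that assumption, not the answerer's method. The refs object is a hand-built stand-in with shortened authors, and get_field and first_ref_row are hypothetical helper names. With the real data, refs could presumably be collected with something like do.call(c, lapply(my_output, attr, "references")), which is untested here.

```r
# Hand-built stand-in for the references of one GI (authors shortened)
one_gi <- list(
  list(pubmedid = character(0),
       authors  = "Rusznyak,A., Akob,D.M.",
       title    = "Calcite mineralization by karstic cave bacteria",
       journal  = "Unpublished"),
  list(pubmedid = character(0),
       authors  = "Rusznyak,A.",
       title    = "Direct Submission",
       journal  = "Submitted (13-APR-2011) to the INSDC.")
)
refs <- list("336087836" = one_gi, "336087835" = one_gi, "336087834" = one_gi)

# Converting a reference as-is fails: pubmedid has 0 rows, the rest have 1
err <- tryCatch(as.data.frame(one_gi[[1]]),
                error = function(e) conditionMessage(e))

# Hypothetical helper: return a field, or NA when it is missing or empty
get_field <- function(ref, name) {
  value <- ref[[name]]
  if (length(value) > 0) value[1] else NA_character_
}

# Hypothetical helper: one row per GI, built from its first reference only
first_ref_row <- function(gi) {
  ref <- refs[[gi]][[1]]
  data.frame(GI      = gi,
             authors = get_field(ref, "authors"),
             title   = get_field(ref, "title"),
             journal = get_field(ref, "journal"),
             stringsAsFactors = FALSE)
}

result <- do.call(rbind, lapply(names(refs), first_ref_row))
result
```

Picking the fields one by one, instead of converting the whole reference, sidesteps the empty pubmedid and keeps the GI alongside each row, which matches the table layout requested in the question.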