[Posted]: 2022-01-13 12:06:03
[Question]:
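I have a data.frame with several columns, each of a different class. For example:
Column 1: "id", a list of 7803 individual elements.
Column 2: "location", a character vector (7803 rows, one string per row).
Column 3: "alleles", a list of 7803 elements.
Column 4: "clinical_significance", a list of lists with 7803 elements, each of which may itself contain one to three elements.
Here is a small subset produced with dput(), showing what it looks like: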
structure(list(id = list("rs1585931494", "rs1253996056", "rs368528867",
"rs397507487", "rs1291775716", "rs1205853831", "rs555976452",
"rs727502904", "rs1481562268"), location = c("1:140734725-140734725",
"1:140734735-140734735", "1:140734742-140734742", "1:140734743-140734743",
"1:140734752-140734752", "1:140734755-140734755", "1:140734758-140734758",
"1:140734763-140734763", "1:140734764-140734764"), alleles = list(
structure(c("G", "A"), .Dim = 2:1), structure(c("C", "A"), .Dim = 2:1),
structure(c("C", "A", "T"), .Dim = c(3L, 1L)), structure(c("G",
"A"), .Dim = 2:1), structure(c("G", "C"), .Dim = 2:1), structure(c("C",
"A"), .Dim = 2:1), structure(c("T", "A", "C"), .Dim = c(3L,
1L)), structure(c("G", "A", "T"), .Dim = c(3L, 1L)), structure(c("C",
"A", "T"), .Dim = c(3L, 1L))), clinical_significance = list(
list(), list(), structure("uncertain significance", .Dim = c(1L,
1L)), list(), list(), list(), list(), structure(c("uncertain significance",
"likely pathogenic"), .Dim = 2:1), structure("likely pathogenic", .Dim = c(1L,
1L))), consequence_type = list("missense_variant", "missense_variant",
"missense_variant", "missense_variant", "missense_variant",
"stop_gained", "missense_variant", "missense_variant", "missense_variant"),
gene_symbol = c("ENSG00000139618", "ENSG00000139618", "ENSG00000139618",
"ENSG00000139618", "ENSG00000139618", "ENSG00000139618",
"ENSG00000139618", "ENSG00000139618", "ENSG00000139618")), row.names = c(3544L,
3545L, 3547L, 3548L, 3550L, 3552L, 3554L, 3556L, 3557L), class = "data.frame")
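I want a simple data.frame with a single character value in each [row, column] cell. I am having particular trouble unlisting the clinical_significance column. Since a cell may contain multiple elements, I just want to collapse them into a single comma-separated string, but I have not been able to get anywhere close to that.
I tried the following solutions: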
do.call(rbind.data.frame, my_df)
Error in (function (..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = default.stringsAsFactors(), :
invalid list argument: all variables should have the same length
# This appears to work, but writing the result out as a table fails
df <- dplyr::bind_rows(my_df) # or: df <- purrr::map_df(my_df, dplyr::bind_rows)
write.table(df)
Error in write.table(df) : unimplemented type 'list' in 'EncodeElement'
Any feedback or suggestions would be appreciated.
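For reference, the flattening described in the question could be sketched in base R roughly as follows. This is only one possible approach, not an answer from the thread. The three rows are copied from the dput() subset; the helper name `collapse_cell` is hypothetical, and turning empty cells into NA (rather than an empty string) is an assumption about the desired output.

```r
# Three rows taken from the dput() subset in the question
my_df <- structure(
  list(
    id = list("rs368528867", "rs1205853831", "rs727502904"),
    location = c("1:140734742-140734742", "1:140734755-140734755",
                 "1:140734763-140734763"),
    alleles = list(
      structure(c("C", "A", "T"), .Dim = c(3L, 1L)),
      structure(c("C", "A"), .Dim = 2:1),
      structure(c("G", "A", "T"), .Dim = c(3L, 1L))
    ),
    clinical_significance = list(
      structure("uncertain significance", .Dim = c(1L, 1L)),
      list(),
      structure(c("uncertain significance", "likely pathogenic"), .Dim = 2:1)
    )
  ),
  row.names = c(3547L, 3552L, 3556L), class = "data.frame"
)

# Hypothetical helper: collapse one cell into a single comma-separated string.
# Assumption: an empty cell (list()) should become NA.
collapse_cell <- function(x) {
  x <- unlist(x, use.names = FALSE)
  if (length(x) == 0) NA_character_ else paste(x, collapse = ",")
}

# Apply the helper to every list column; plain character columns are left alone
flat_df <- my_df
is_list_col <- vapply(flat_df, is.list, logical(1))
flat_df[is_list_col] <- lapply(
  flat_df[is_list_col],
  function(col) vapply(col, collapse_cell, character(1))
)
```

After this step every column of `flat_df` is an ordinary character vector, so a call such as `write.table(flat_df)` should no longer hit the 'list' encoding error shown above.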
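[Comments]:
-
Can you give us a small working example, e.g. the first 2 or 5 rows?
-
Welcome to Stack Overflow! Could you please read "How to make a great R reproducible example?" and incorporate elements from it, especially the part about using dput() for the input, followed by an explicit example of your expected dataset?
-
Yes, sorry! I didn't know how to do that.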