【Question title】: rapply over a nested list in R
【Posted】: 2013-05-11 20:43:38
【Question】:

I'm having trouble using rapply over a nested list. Here is a sample of the structure of one element of the list:

$ F01    :List of 7
  ..$ 0:'data.frame':   16 obs. of  3 variables:
  .. ..$ lengths: Factor w/ 8 levels "1","2","4","5",..: 1 2 3 4 5 6 7 8 1 2 ...
  .. ..$ values : Factor w/ 2 levels "C","N": 1 1 1 1 1 1 1 1 2 2 ...
  .. ..$ Freq   : int [1:16] 1 2 0 1 1 1 1 0 1 3 ...
  ..$ 1:'data.frame':   20 obs. of  3 variables:
  .. ..$ lengths: Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
  .. ..$ values : Factor w/ 2 levels "C","N": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ Freq   : int [1:20] 0 1 1 1 1 0 1 0 1 1 ...

I can easily apply a function to one element of the list with lapply, say F01:

 lapply(data$F01,function(x) x[which(x[['values']]=="C"),])

I would then like to use rapply to apply it over the whole nested list:

rapply(data,function(x) x[which(x[['values']]=="C"),],how="list")
Error in `[[.default`(x, "values") : subscript out of bounds

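I don't understand why rapply raises this error, since rapply is supposed to apply the function recursively to the non-list elements, which in this case are the data.frames. Is there something obvious I'm missing?

Here is a sample of two complete elements of the main list: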

samp <- list(structure(list(`0` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), .Label = c("1", 
    "2", "7", "8", "13", "18"), class = "factor"), values = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
    "N"), class = "factor"), Freq = c(0L, 1L, 1L, 1L, 1L, 0L, 2L, 
    0L, 0L, 0L, 0L, 1L)), .Names = c("lengths", "values", "Freq"), row.names = c(NA, 
    -12L), class = "data.frame"), `1` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), .Label = c("1", 
    "2", "3", "5", "8", "12"), class = "factor"), values = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
    "N"), class = "factor"), Freq = c(1L, 1L, 0L, 1L, 1L, 1L, 2L, 
    0L, 1L, 1L, 0L, 0L)), .Names = c("lengths", "values", "Freq"), row.names = c(NA, 
    -12L), class = "data.frame"), `2` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("1", 
    "3", "4", "6", "9", "19", "20"), class = "factor"), values = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
    "N"), class = "factor"), Freq = c(1L, 1L, 1L, 1L, 0L, 1L, 0L, 
    0L, 0L, 3L, 0L, 1L, 0L, 2L)), .Names = c("lengths", "values", 
    "Freq"), row.names = c(NA, -14L), class = "data.frame"), `3` = structure(list(
        lengths = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 
        2L, 3L, 4L, 5L, 6L, 7L, 8L), .Label = c("1", "2", "3", "4", 
        "5", "8", "11", "18"), class = "factor"), values = structure(c(1L, 
        1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
        ), .Label = c("C", "N"), class = "factor"), Freq = c(1L, 
        2L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 2L, 1L, 1L, 1L, 0L, 0L, 1L
        )), .Names = c("lengths", "values", "Freq"), row.names = c(NA, 
    -16L), class = "data.frame"), `4` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("1", 
    "2", "3", "4", "6", "11", "13"), class = "factor"), values = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
    "N"), class = "factor"), Freq = c(0L, 2L, 0L, 1L, 1L, 0L, 2L, 
    1L, 2L, 2L, 0L, 0L, 1L, 0L)), .Names = c("lengths", "values", 
    "Freq"), row.names = c(NA, -14L), class = "data.frame"), `5` = structure(list(
        lengths = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
        1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("1", "2", 
        "4", "5", "6", "7", "8", "11", "23"), class = "factor"), 
        values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
        2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", "N"), class = "factor"), 
        Freq = c(0L, 3L, 1L, 2L, 0L, 1L, 0L, 0L, 1L, 3L, 2L, 0L, 
        0L, 1L, 0L, 1L, 1L, 0L)), .Names = c("lengths", "values", 
    "Freq"), row.names = c(NA, -18L), class = "data.frame"), `6` = structure(list(
        lengths = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
        10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L), .Label = c("1", 
        "2", "3", "4", "5", "6", "9", "12", "13", "21", "36"), class = "factor"), 
        values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
        1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
        "N"), class = "factor"), Freq = c(2L, 2L, 3L, 1L, 2L, 1L, 
        2L, 1L, 0L, 0L, 0L, 2L, 3L, 1L, 4L, 0L, 1L, 0L, 0L, 1L, 1L, 
        1L)), .Names = c("lengths", "values", "Freq"), row.names = c(NA, 
    -22L), class = "data.frame")), .Names = c("0", "1", "2", "3", 
    "4", "5", "6")), structure(list(`0` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 1L, 2L, 3L, 4L), .Label = c("2", "13", "17", "25"
    ), class = "factor"), values = structure(c(1L, 1L, 1L, 1L, 2L, 
    2L, 2L, 2L), .Label = c("C", "N"), class = "factor"), Freq = c(1L, 
    1L, 0L, 1L, 0L, 0L, 1L, 1L)), .Names = c("lengths", "values", 
    "Freq"), row.names = c(NA, -8L), class = "data.frame"), `1` = structure(list(
        lengths = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 
        4L, 5L, 6L), .Label = c("1", "2", "3", "4", "5", "8"), class = "factor"), 
        values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
        2L, 2L, 2L), .Label = c("C", "N"), class = "factor"), Freq = c(0L, 
        0L, 1L, 2L, 2L, 0L, 1L, 1L, 0L, 1L, 1L, 1L)), .Names = c("lengths", 
    "values", "Freq"), row.names = c(NA, -12L), class = "data.frame"), 
        `2` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("2", 
        "3", "4", "7", "14", "18", "19"), class = "factor"), values = structure(c(1L, 
        1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
        "N"), class = "factor"), Freq = c(1L, 1L, 2L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L)), .Names = c("lengths", "values", 
        "Freq"), row.names = c(NA, -14L), class = "data.frame"), 
        `3` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L), .Label = c("2", 
        "3", "5", "8", "9", "10", "19", "76"), class = "factor"), 
            values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
            2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", "N"), class = "factor"), 
            Freq = c(1L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 
            1L, 1L, 0L, 1L, 1L)), .Names = c("lengths", "values", 
        "Freq"), row.names = c(NA, -16L), class = "data.frame"), 
        `4` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("2", 
        "5", "7", "8", "9", "16", "35"), class = "factor"), values = structure(c(1L, 
        1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
        "N"), class = "factor"), Freq = c(1L, 1L, 2L, 0L, 1L, 0L, 
        0L, 1L, 0L, 0L, 2L, 0L, 1L, 1L)), .Names = c("lengths", "values", 
        "Freq"), row.names = c(NA, -14L), class = "data.frame"), 
        `5` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("1", 
        "2", "3", "5", "6", "10", "11", "14", "27"), class = "factor"), 
            values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
            1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
            "N"), class = "factor"), Freq = c(2L, 2L, 1L, 1L, 1L, 
            1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L)), .Names = c("lengths", 
        "values", "Freq"), row.names = c(NA, -18L), class = "data.frame"), 
        `6` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("1", 
        "2", "3", "4", "5", "6", "11", "21", "51"), class = "factor"), 
            values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
            1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
            "N"), class = "factor"), Freq = c(2L, 1L, 2L, 2L, 1L, 
            1L, 0L, 0L, 0L, 3L, 0L, 2L, 0L, 1L, 1L, 1L, 1L, 1L)), .Names = c("lengths", 
        "values", "Freq"), row.names = c(NA, -18L), class = "data.frame")), .Names = c("0", 
    "1", "2", "3", "4", "5", "6")))

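【Comments】:

  • If your data looks like this, why isn't samp a nested list? You won't be able to use rapply when your list elements are data.frames, because data.frames are lists, so rapply will iterate over their columns. That is why you get the error: it is trying to get the "values" item from each column of your data.frames.
  • Use lapply instead of rapply in your expression.
  • @eddi, I think Chargaff just posted the wrong sample data. If you look at the str output at the top of the OP, it is clearly nested.
  • @MatthewPlourde, samp is just the first element of the nested list, so it isn't nested. I can remove it if it's confusing.
  • @Chargaff, it isn't merely confusing; that piece of information is crucial. The whole issue here is the depth of the list. Posting a child without explaining that it is a child doesn't really help anyone help you.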
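
The point made in these comments, that rapply descends into the columns of a data.frame, can be checked on a small example. The sketch below uses a hypothetical toy list (`toy` is not the OP's data) and only records the class of each object rapply passes to the function:

```r
# Hypothetical toy data (not the OP's): a list of lists of data.frames.
toy <- list(
  F01 = list(
    `0` = data.frame(values = c("C", "N"), Freq = c(1L, 2L),
                     stringsAsFactors = TRUE)
  )
)

# Record the class of every object rapply() hands to the function.
# A data.frame is itself a list, so rapply() recurses into it and
# the function only ever sees the individual columns.
seen <- rapply(toy, function(x) class(x)[1], how = "unlist")
print(seen)

# The OP's function therefore receives a column, not a data.frame,
# and x[["values"]] fails on it.
res <- try(
  rapply(toy, function(x) x[which(x[["values"]] == "C"), ], how = "list"),
  silent = TRUE
)
print(inherits(res, "try-error"))
```

Since the function is never called with a whole data.frame, no choice of `how` makes the original call work as written.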

Tags: r lapply


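【Solution 1】:

I don't believe you really want to use rapply here, since you don't seem to want total recursion. That is, you aren't trying to apply the function to lengths, then to values, and so on.

Instead, just try two nested lapply calls: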

 lapply(dat, lapply, function(x) x[which(x[['values']]=="C"),])
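
As a minimal sketch of how the nested call behaves, the example below builds a small hypothetical `dat` (the names `df`, `dat` and `keep_C` are made up for illustration and are not the OP's data) with the same two-level shape, a list of lists of data.frames:

```r
# Hypothetical toy data frame with the same columns as in the question.
df <- data.frame(
  lengths = factor(c("1", "2", "1", "2")),
  values  = factor(c("C", "C", "N", "N")),
  Freq    = c(0L, 1L, 2L, 0L)
)

# Two levels of nesting: outer list -> inner list -> data.frame.
dat <- list(F01 = list(`0` = df, `1` = df), F02 = list(`0` = df))

# Keep only the rows whose `values` column equals "C".
keep_C <- function(x) x[which(x[["values"]] == "C"), ]

# The outer lapply walks over F01, F02; the inner lapply walks over
# the data.frames inside each of them and calls keep_C on each one.
res <- lapply(dat, lapply, keep_C)
str(res, max.level = 2)
```

The outer lapply passes `lapply` itself as the function and `keep_C` as its extra argument, so each data.frame is handled as a whole rather than column by column.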

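【Comments】:

  • Thanks for your answer. I get the same error as with rapply. You're right that I'm not trying to apply the function to lengths and values, but isn't rapply needed to apply a function recursively over a nested list?
  • @Chargaff, the error is just telling you that there is no element named values in that x. In other words, you are at the wrong level. The code above works with dat <- list(samp, samp). I suggest editing the OP to use your actual dat (or dput(dat[1:2])) instead of samp.
  • I edited the question. Yes, your code works as expected on the samp data; I just don't know why it fails on my actual dataset, since the structure is exactly the same... I'll look into it.
  • @Chargaff, I copied and pasted your newly edited data, then copied and pasted the code I have here. I didn't get any errors.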
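
Because the error comes down to being at the wrong level, one way to check how many lapply calls are needed is to count the list levels above the first data.frame. The helper below is a hypothetical sketch (`depth_to_df` is not a base R function), and it assumes every branch has the same depth because it only follows the first element:

```r
# Hypothetical helper: number of list levels between `x` and the first
# data.frame found by always following the first element.
# Returns NA if no data.frame is reached along that path.
depth_to_df <- function(x) {
  if (is.data.frame(x)) return(0L)
  if (!is.list(x) || length(x) == 0L) return(NA_integer_)
  1L + depth_to_df(x[[1]])
}

# Toy data (made up for illustration).
df <- data.frame(values = factor(c("C", "N")), Freq = c(1L, 0L))

one_level <- list(`0` = df, `1` = df)      # needs a single lapply
two_level <- list(F01 = one_level)         # needs lapply(dat, lapply, f)

print(depth_to_df(one_level))
print(depth_to_df(two_level))
```

A result of 1 suggests a single lapply, and 2 suggests the nested form from the answer.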