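【Question title】: Subset nested list
【Posted】: 2019-02-05 19:27:56
【Question description】:

I have a nested list that I want to filter on multiple conditions. I know similar questions have been asked, but for some reason the approaches given there do not work on my list: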

myList <- list(list(list(FileName = list("05_C13_1.mzML"), Molecule = "Adenine", 
            Adduct = list("2M+H"), cons.Area = list(42158.2196614537))), 
            list(list(FileName = list("05_C13_2.mzML"), Molecule = "Phenylalanine", 
            Adduct = list("2M+H"), cons.Area = list(36879.9850931971))), 
            list(list(FileName = list("10_C13_2.mzML", "10_C13_2.mzML"), 
            Molecule = "Adenine", Adduct = list("M+K", "M+K"), cons.Area = list(
            512368.044002373, 60847.2653549584))))

This is the function I tried:

get_sublist <- function(lst, group_name) {
                lst[lapply(lst, function(x) x[[1]][[1]]) == group_name]
}

It works well on the following list, but for reasons I do not understand it does not work on mine (even if I replace x[[1]][[1]] with x[[1]]):

ThisListWorks <- list(list(list(group = "a", def = "control")), list(list(group = "b", 
        def = "disease1")))

The desired output for my example would be, for instance:

SubList1 <- get_sublist(myList, "Adenine")

SubList1
list(list(list(FileName = list("05_C13_1.mzML"), Molecule = "Adenine", 
    Adduct = list("2M+H"), cons.Area = list(42158.2196614537))), 
    list(list(FileName = list("10_C13_2.mzML", "10_C13_2.mzML"), 
    Molecule = "Adenine", Adduct = list("M+K", "M+K"), cons.Area = list(
    512368.044002373, 60847.2653549584))))

or:

SubList2 <- get_sublist(myList, "10_C13_2.mzML")

SubList2
list(list(list(FileName = list("10_C13_2.mzML", "10_C13_2.mzML"), 
    Molecule = "Adenine", Adduct = list("M+K", "M+K"), cons.Area = list(
    512368.044002373, 60847.2653549584))))

【Discussion】:

    Tags: r list filter nested subset


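    【Solution 1】:

    I think the index you are using (x[[1]][[1]]) is wrong. It selects the first field of each entry, so the function looks for Adenine in the FileName entry rather than in Molecule.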
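To make the indexing problem concrete, here is a minimal sketch using one top-level element copied from the question's `myList`. The variable names `x`, `first_field` and `molecule` are only illustrative.

```r
# One top-level element of myList, copied from the question
x <- list(list(FileName = list("05_C13_1.mzML"), Molecule = "Adenine",
               Adduct = list("2M+H"), cons.Area = list(42158.2196614537)))

# Positional indexing picks the first field, whatever it happens to be
field_names <- names(x[[1]])        # FileName comes first, Molecule second
first_field <- x[[1]][[1]]          # the FileName entry, itself a list

# Indexing by name picks the field that is actually wanted
molecule <- x[[1]][["Molecule"]]    # the character string "Adenine"
```

Indexing by name keeps working if the order of the fields changes, which is why the revised function takes the field name as an argument.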

    You can make the function more robust by passing the field name as an argument:

    get_sublist <- function(lst, var, group_name) {
      lst[lapply(lst, function(x) x[[1]][[var]]) == group_name]
    }
    

    Then:

    xx <- get_sublist(myList, var = "Molecule", group_name = "Adenine")
    dput(xx)
    list(list(list(FileName = list("05_C13_1.mzML"), Molecule = "Adenine", 
        Adduct = list("2M+H"), cons.Area = list(42158.2196614537))), 
        list(list(FileName = list("10_C13_2.mzML", "10_C13_2.mzML"), 
            Molecule = "Adenine", Adduct = list("M+K", "M+K"), cons.Area = list(
                512368.044002373, 60847.2653549584))))
    

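    This works as long as the var level is not itself a list. In your second example (filtering on FileName) there is an extra level of nesting, so the approach above will not work.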
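The difference between the two kinds of field can be sketched as follows. The entry is copied from the question. Flattening with `unlist()` before testing membership is one possible way to treat both cases uniformly, not the only one.

```r
# An entry whose FileName is a list while Molecule is a plain string
entry <- list(FileName = list("10_C13_2.mzML", "10_C13_2.mzML"),
              Molecule = "Adenine")

file_is_list     <- is.list(entry[["FileName"]])   # extra level of nesting
molecule_is_list <- is.list(entry[["Molecule"]])   # plain character vector

# unlist() flattens a list and leaves an atomic vector unchanged,
# so %in% can be applied to either kind of field
file_match     <- "10_C13_2.mzML" %in% unlist(entry[["FileName"]])
molecule_match <- "Adenine" %in% unlist(entry[["Molecule"]])
```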

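    I think the first level of your list serves no purpose in this problem, so I dropped it and wrote a recursive function that handles any number of levels: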

    get_sublist <- function(lst, var, group_name) {

      if (!(var %in% names(lst))) {
        # No `var` at this level: test each child and record which ones match
        pos <- sapply(X = lst, FUN = get_sublist, var = var, group_name = group_name)
      } else {
        # `var` found: flatten it (it may itself be a list) and report a match
        return(group_name %in% unlist(lst[[var]]))
      }

      # Keep only the children that matched
      lst[pos]
    }
    

    Then:

    xx <- get_sublist(unlist(myList, recursive = FALSE), var = "Molecule", group_name = "Adenine")
    dput(xx)
    list(list(FileName = list("05_C13_1.mzML"), Molecule = "Adenine", 
        Adduct = list("2M+H"), cons.Area = list(42158.2196614537)), 
        list(FileName = list("10_C13_2.mzML", "10_C13_2.mzML"), Molecule = "Adenine", 
            Adduct = list("M+K", "M+K"), cons.Area = list(512368.044002373, 
                60847.2653549584)))
    

    And:

    yy <- get_sublist(unlist(myList, recursive = FALSE), var = "FileName", group_name = "10_C13_2.mzML")
    dput(yy)
    list(list(FileName = list("10_C13_2.mzML", "10_C13_2.mzML"), 
        Molecule = "Adenine", Adduct = list("M+K", "M+K"), cons.Area = list(
            512368.044002373, 60847.2653549584)))
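The question also mentions filtering on several conditions at once. One possible way to do that, sketched here rather than taken from the answer, is to combine per-field tests inside base R's `Filter()`. The helper `has_value` and the variables `flat` and `both` are hypothetical names introduced for illustration. `myList` is repeated from the question so the block runs on its own.

```r
# myList as given in the question
myList <- list(
  list(list(FileName = list("05_C13_1.mzML"), Molecule = "Adenine",
            Adduct = list("2M+H"), cons.Area = list(42158.2196614537))),
  list(list(FileName = list("05_C13_2.mzML"), Molecule = "Phenylalanine",
            Adduct = list("2M+H"), cons.Area = list(36879.9850931971))),
  list(list(FileName = list("10_C13_2.mzML", "10_C13_2.mzML"),
            Molecule = "Adenine", Adduct = list("M+K", "M+K"),
            cons.Area = list(512368.044002373, 60847.2653549584))))

# Hypothetical helper: TRUE if `value` occurs in field `var` of one entry
has_value <- function(entry, var, value) value %in% unlist(entry[[var]])

# Drop the redundant first level, as in the answer
flat <- unlist(myList, recursive = FALSE)

# Keep entries that satisfy both conditions
both <- Filter(function(e) {
  has_value(e, "Molecule", "Adenine") && has_value(e, "Adduct", "M+K")
}, flat)
```

With the data above, only the third entry is Adenine with the M+K adduct, so `both` contains that single entry.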
    
