【Question title】: R: Extract list columns based on column names and patterns
【Posted】: 2015-09-30 02:18:08
【Question】:

I have a list (only sample data shown here):

my_list <- list(structure(list(sample = c(2L, 6L), data1 = c(56L, 78L), 
    data2 = c(59L, 27L), data3 = c(90L, 28L), data1namet = structure(c(1L, 
    1L), .Label = "Sam1", class = "factor"), data2namab = structure(c(1L, 
    1L), .Label = "Test2", class = "factor"), dataame = structure(c(1L, 
    1L), .Label = "Ex3", class = "factor"), ma = c("Jay", "Jay"
    )), .Names = c("sample", "data1", "data2", "data3", "data1namet", 
"data2namab", "dataame", "ma"), row.names = c(NA, -2L), class = "data.frame"), 
    structure(list(sample = c(12L, 13L, 17L), data1 = c(56L, 
    78L, 3L), data2 = c(59L, 27L, 2L), datest = structure(c(1L, 
    1L, 1L), .Label = "Exa9", class = "factor"), dattestr = structure(c(1L, 
    1L, 1L), .Label = "cz1", class = "factor"), add = c(2, 2, 
    2)), .Names = c("sample", "data1", "data2", "datest", "dattestr", 
    "add"), row.names = c(NA, -3L), class = "data.frame"))

my_list
[[1]]
  sample data1 data2 data3 data1namet data2namab dataame  ma
1      2    56    59    90       Sam1      Test2     Ex3 Jay
2      6    78    27    28       Sam1      Test2     Ex3 Jay

[[2]]
  sample data1 data2 datest dattestr add
1     12    56    59   Exa9      cz1   2
2     13    78    27   Exa9      cz1   2
3     17     3     2   Exa9      cz1   2

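I have two questions. First, I want to extract columns from this list based on a pattern in the column names, for example all columns whose names contain the word "data". I couldn't find a solution using grep.

Second, I know how to extract a column by its index number (see the example below), but how can I make this selection directly by column name rather than by column number?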

out <- lapply(my_list, `[`, 1) # extract "sample" column

【Discussion】:

    Tags: r list indexing extract


    【Solution 1】:

    Try

    lapply(my_list, function(df) df[, grep("data", names(df), fixed = TRUE)] )
    # [[1]]
    #   data1 data2 data3 data1namet data2namab dataame
    # 1    56    59    90       Sam1      Test2     Ex3
    # 2    78    27    28       Sam1      Test2     Ex3
    # 
    # [[2]]
    #   data1 data2
    # 1    56    59
    # 2    78    27
    # 3     3     2
    
    lapply(my_list, "[", "sample")
    # [[1]]
    #   sample
    # 1      2
    # 2      6
    # 
    # [[2]]
    #   sample
    # 1     12
    # 2     13
    # 3     17
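
A side note on the two calls above, as a minimal sketch rather than part of the original answer: indexing with a comma (`df[, idx]`) collapses to a plain vector when only one column matches, whereas indexing without a comma (`df[idx]`, which is what `lapply(my_list, "[", "sample")` does) always returns a data frame. The data frame `df` below is hypothetical and was built so that exactly one column name contains "data".

```r
# Hypothetical data frame in which only one column name contains "data"
df <- data.frame(sample = c(2L, 6L), data1 = c(56L, 78L), ma = c("Jay", "Jay"))

# Positions of the column names that contain the literal string "data"
idx <- grep("data", names(df), fixed = TRUE)

with_comma    <- df[, idx]                # one match: collapses to a vector
without_comma <- df[idx]                  # list-style indexing: stays a data frame
kept          <- df[, idx, drop = FALSE]  # matrix-style, but dimensions are kept

class(with_comma)
class(without_comma)
class(kept)
```

If later code expects a data frame for every list element, either `df[idx]` or `drop = FALSE` avoids the silent change of type.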
    

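    【Comments】:

    • Is there a reason you are using fixed = TRUE here?
    • No specific reason. I'm just used to it when searching for a fixed string, in this case "data". Performance! ;-)
    • Thanks, this is very helpful. Is there a way to make multiple selections, e.g. select all columns containing "sample" as well as all columns whose names contain "data"?
    • One way is to use a regular expression with OR (pipe): grep("data|sample", names(df))
    • When I run this code, I get an "undefined dimensions" error. Please help.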
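
The multi-pattern suggestion from the comments can be sketched as follows. Note that the pipe only acts as OR when `fixed = TRUE` is left out; with `fixed = TRUE` the string "data|sample" would be searched for literally. The helper name `select_cols` and the trimmed-down `my_list` are made up for this illustration. The data behind the last comment's error is not shown, so its cause is unknown; one possible cause (an assumption) is a list element that is not a data frame, because `x[, j]` on a plain vector fails with "incorrect number of dimensions". The sketch checks for that case explicitly.

```r
# Small stand-in for the question's my_list (same column names, trimmed data)
my_list <- list(
  data.frame(sample = c(2L, 6L), data1 = c(56L, 78L), data2 = c(59L, 27L),
             ma = c("Jay", "Jay"), stringsAsFactors = FALSE),
  data.frame(sample = c(12L, 13L, 17L), data1 = c(56L, 78L, 3L),
             datest = c("Exa9", "Exa9", "Exa9"), add = c(2, 2, 2),
             stringsAsFactors = FALSE)
)

# Hypothetical helper: keep the columns whose names match a regular expression
select_cols <- function(df, pattern) {
  # Assumption: a non-data-frame element is what triggers the dimension
  # error, so stop early instead of failing inside the subsetting call
  stopifnot(is.data.frame(df))
  # drop = FALSE keeps a data frame even when only one column matches
  df[, grepl(pattern, names(df)), drop = FALSE]
}

# "data|sample" is a regex: match names containing "data" OR "sample"
out <- lapply(my_list, select_cols, pattern = "data|sample")
lapply(out, names)
```

With this stand-in data, the first element keeps `sample`, `data1` and `data2`, and the second keeps `sample` and `data1`; `datest` is not selected because it does not contain "data".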