【Question title】: R dplyr subset with missing columns
【Posted】: 2020-07-23 22:10:50
【Question】:

I have the following code and want to select columns into a new data.frame:

library(dplyr)
df = data.frame(
    Manhattan=c(1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0), 
    Brooklyn=c(0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0), 
    The_Bronx=c(1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0), 
    Staten_Island=c(0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), 
    "2012"=c("P", "P", "P", "P", "P", "P", "P", "P", "P", "P", "Q", "Q", "Q", "Q", "Q", "Q", "Q", "Q", "Q"), 
    "2013"=c("P", "P", "P", "P", "P", "P", "P", "P", "Q", "Q", "P", "P", "P", "P", "Q", "Q", "Q", "Q", "Q"), 
    "2014"=c("P", "P", "P", "Q", "Q", "P", "P", "Q", "Q", "Q", "Q", "Q", "P", "Q", "P", "P", "P", "Q", "Q"), 
    "2015"=c("P", "P", "P", "P", "P", "Q", "Q", "Q", "P", "Q", "P", "P", "Q", "Q", "Q", "Q", "Q", "Q", "Q"), check.names=FALSE)
df2 <- subset(df, select = c("Manhattan", "Queens", "The_Bronx"))

This raises an error:

Error in `[.data.frame`(x, r, vars, drop = drop) : 
   undefined columns selected

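because the "Queens" column is missing from df. How can I get around this error so that R still creates df2 with only the "Manhattan" and "The_Bronx" columns?

Very important: my real data has hundreds of columns, so manually removing columns such as "Queens" from the command df2 <- subset(df, select = c("Manhattan", "Queens", "The_Bronx")) is not feasible (unless there is some trick?). Is there a way around this? Thanks.

【Question comments】:

    Tags: r dataframe dplyr subset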


    【Solution 1】:

    In base R, you can use intersect to select only the names that exist.

    cols <- c("Manhattan", "Queens", "The_Bronx")
    subset(df, select = intersect(names(df), cols))
    
    #   Manhattan The_Bronx
    #1          1         1
    #2          1         1
    #3          0         0
    #4          1         0
    #5          1         0
    #6          1         0
    #7          1         0
    #8          0         0
    #...
    #....
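
    As a variation on the intersect approach, the sketch below also reports which requested names were skipped. The toy data frame and the variable name missing_cols are made up for illustration and are not from the question. Note that intersect(cols, names(df)) keeps the order of cols, whereas intersect(names(df), cols) keeps the column order of the data frame.

    ```r
    # Toy data frame standing in for the real one (illustrative values only)
    df <- data.frame(Manhattan = c(1, 0), Brooklyn = c(0, 1), The_Bronx = c(1, 1))
    cols <- c("The_Bronx", "Queens", "Manhattan")

    # Requested names that do not exist in df
    missing_cols <- setdiff(cols, names(df))
    if (length(missing_cols) > 0) {
      message("Skipping missing columns: ", paste(missing_cols, collapse = ", "))
    }

    # Keep only the existing columns, in the order given by cols;
    # drop = FALSE keeps a data.frame even if a single column remains
    df2 <- df[, intersect(cols, names(df)), drop = FALSE]
    ```

    With hundreds of requested columns, the message makes it easier to spot typos in cols that would otherwise be dropped silently.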
    

    Or use any_of in dplyr:

    library(dplyr)
    df %>% select(tidyselect::any_of(cols))
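
    To make the difference concrete, here is a minimal sketch contrasting any_of with its strict counterpart all_of. It assumes dplyr 1.0.0 or later, which re-exports both helpers from tidyselect; the small data frame is invented for illustration.

    ```r
    library(dplyr)

    # Toy data frame (illustrative values only)
    df <- data.frame(Manhattan = c(1, 0), Brooklyn = c(0, 1), The_Bronx = c(1, 1))
    cols <- c("Manhattan", "Queens", "The_Bronx")

    # any_of() skips "Queens" because it is not a column of df
    df2 <- df %>% select(any_of(cols))

    # all_of() is strict: it raises an error when a name is missing,
    # which is caught here and turned into a marker value
    strict <- tryCatch(
      df %>% select(all_of(cols)),
      error = function(e) "error"
    )
    ```

    any_of suits cases like the question, where missing columns are expected; all_of is the safer choice when a missing column would indicate a mistake.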
    

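    【Comments】:

    • Wow! Amazing! Thank you so much for the quick solution! :)
    • Hi, why does my R return "Error: 'any_of' is not an exported object from 'namespace:tidyselect'"?
    • @Negrito You may need to upgrade to the latest version of tidyselect. What does packageVersion('tidyselect') return for you? Mine is 1.0.0.
    【Solution 2】: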

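    We can also use matches with a regular expression built from the names:

    cols <- c("Manhattan", "Queens", "The_Bronx")
    library(dplyr)
    library(stringr)  # provides str_c()
    df %>%
       select(matches(str_c(cols, collapse="|")))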
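
    One caveat with this approach: matches does regular-expression matching, so an unanchored pattern also selects columns that merely contain one of the names. The sketch below illustrates this with a hypothetical extra column, Manhattan_North, which is not part of the question's data, and uses base paste and paste0 in place of str_c.

    ```r
    library(dplyr)

    # "Manhattan_North" is a hypothetical column added to show partial matching
    df <- data.frame(Manhattan = c(1, 0), Manhattan_North = c(0, 1), The_Bronx = c(1, 1))
    cols <- c("Manhattan", "Queens", "The_Bronx")

    # Unanchored alternation: also picks up "Manhattan_North"
    loose <- df %>% select(matches(paste(cols, collapse = "|")))

    # Anchored alternation: only whole-name matches are selected
    pattern <- paste0("^(", paste(cols, collapse = "|"), ")$")
    exact <- df %>% select(matches(pattern))
    ```

    If column names may contain regex metacharacters such as "." or "(", they would need escaping before being pasted into the pattern; any_of avoids that concern because it compares names literally.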
    
