【Question Title】:Filtering multiple columns of data frame inside a loop in R
【Posted】:2020-09-25 14:18:49
【Question】:

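I want to filter several columns of a data frame in a loop, dropping every row in which any of the given columns holds a value from a specific list.

For example: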

> my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
> color_words = c("red", "orange", "yellow", "green", "blue")
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

Using the dplyr filter() function:

> my_df %>% filter(!word1 %in% color_words) %>% filter(!word2 %in% color_words)
  word1 word2 word3
1   one apple   red

My first attempt at doing this filtering in a loop was:

col_names <- c("word1","word2")
for(col in col_names){
    my_df <- my_df %>% filter(!col %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

I read about quoting and unquoting when using filter(), so I also tried:

for(col in col_names){
    col <- enquo(col)
    my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

for(col in col_names){
    my_df <- my_df %>% filter(!UQ(col) %in% color_words)
}
> my_df
  word1  word2  word3
1   one  apple    red
2   two orange orange
3   red banana yellow
4  blue   pear  green

What is the correct way to do this filtering in a loop?

【Question Comments】:

    Tags: r dataframe dplyr


    【Solution 1】:

    You don't need a loop; you can use filter() together with across() to apply a function to multiple columns:

    library(dplyr)
    my_df %>% filter(across(all_of(col_names), ~!. %in% color_words))
    
    #  word1 word2 word3
    #1   one apple   red
    

    If you have an older version of dplyr, use filter_at:

    my_df %>% filter_at(col_names, all_vars(!. %in% color_words))
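A related option, assuming dplyr 1.0.4 or later is installed: that release added if_all() and if_any(), which the dplyr documentation suggests for this kind of row filtering in place of across() inside filter(). The following is a minimal sketch that rebuilds the question's data; `result` is just an illustrative name.

```r
library(dplyr)

# Same example data as in the question.
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# Keep a row only if every selected column is free of color words.
# `!` binds more loosely than %in%, so this reads as !(.x %in% color_words).
result <- my_df %>%
  filter(if_all(all_of(col_names), ~ !.x %in% color_words))

print(result)
```

if_all() combines the per-column conditions with AND; if_any() would combine them with OR instead.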
    

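    【Comments】:

    • Sorry, my question wasn't clear, but in general I don't want to filter all of the columns, only the columns in a specific list (col_names in this case).
    • Thanks. I'm still curious why none of my loop attempts worked?
    【Solution 2】: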
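On why the loop attempts leave the data frame unchanged: inside filter(), `col` is not a column of my_df, so it resolves to the loop variable, which is the string "word1" or "word2". That string is not in color_words, so the condition is a single TRUE that gets recycled across all rows, and nothing is dropped. One way to make the loop work is the `.data` pronoun, which looks a column up by the name stored in a string. The sketch below assumes a dplyr version that supports `.data` (0.7.0 or later, as far as I know); `filtered` is an illustrative name.

```r
library(dplyr)

# Same example data as in the question.
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# What the original loop effectively tested: the column *name*, not its values.
name_is_color <- "word1" %in% color_words  # FALSE, so !FALSE keeps every row

filtered <- my_df
for (col in col_names) {
  # .data[[col]] fetches the column whose name is held in `col`.
  filtered <- filtered %>% filter(!.data[[col]] %in% color_words)
}

print(filtered)
```

An equivalent spelling is `filter(!(!!rlang::sym(col)) %in% color_words)`, which turns the string into a symbol before unquoting it.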

    Using base R:

    my_df <- data.frame(word1 = c("one", "two", "red", "blue"), word2 = c("apple","orange","banana","pear"), word3 = c("red", "orange", "yellow", "green"))
    color_words <-  paste0(c("red", "orange", "yellow", "green", "blue"), collapse = "|") 
    fltr <- apply(my_df[1:2], 1, function(x) !any(grepl(color_words, x)))
    my_df[fltr, ]
    #>   word1 word2 word3
    #> 1   one apple   red
    

    Created on 2020-09-25 by the reprex package (v0.3.0)

    【Comments】:
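One caveat with the grepl() approach above: a pattern such as "red|orange|..." matches substrings, so a value like "redo" would also be dropped. If exact matches are wanted, as in the dplyr versions, a base R sketch using %in% could look like the following. It selects columns by name through a `col_names` vector, as in the question; `keep` is an illustrative name.

```r
# Same example data as in the question.
my_df <- data.frame(
  word1 = c("one", "two", "red", "blue"),
  word2 = c("apple", "orange", "banana", "pear"),
  word3 = c("red", "orange", "yellow", "green")
)
color_words <- c("red", "orange", "yellow", "green", "blue")
col_names <- c("word1", "word2")

# One logical vector per selected column: TRUE where the value is not a color word.
per_column <- lapply(my_df[col_names], function(x) !x %in% color_words)

# Combine the vectors element-wise with AND, so a row survives only if
# all selected columns pass.
keep <- Reduce(`&`, per_column)

print(my_df[keep, ])
```

Reduce() with `&` scales to any number of columns in col_names without changing the code.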
