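Easiest syntax on filtering (in R)

【Question title】: Easiest syntax on filtering (in R)
【Posted】: 2014-04-23 15:26:19
【Question】:

I am trying to find the easiest way to filter a dataset with the least amount of syntax. This example uses minimal data, but I am looking for an approach that generalizes to a much larger dataset.

Here is my sample dataset: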

samp <- structure(list(group = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 
4L), .Label = c("a", "b", "c", "d"), class = "factor"), name = structure(c(5L, 
3L, 7L, 2L, 6L, 8L, 4L, 1L), .Label = c("hollis", "jo", "joe", 
"mike", "pat", "scott", "steph", "tim"), class = "factor")), .Names = c("group", 
"name"), class = "data.frame", row.names = c(NA, -8L))

Suppose I want to filter down to the rows where group == 'a' | group == 'b'.

I tried match, but it only returns the first match for each value.

filt <- c('a', 'b')
samp[match(filt, samp$group), ]
  group name
1     a  pat
2     b  joe

I have also tried filter, but with many values to filter on, the syntax becomes verbose.

library(dplyr)
filter(samp, group == 'a' | group == 'b')

  group  name
1     a   pat
2     b   joe
3     a scott
4     b   tim

Ideally, I would like to find a solution like the following:

library(dplyr)
filt <- c('a', 'b')
filter(samp, group == any(filt))

  group  name
1     a   pat
2     b   joe
3     a scott
4     b   tim

Unfortunately, this returns an empty result with the following warning.

[1] group name 
<0 rows> (or 0-length row.names)
Warning message:
In any(c("a", "b")) : coercing argument of type 'character' to logical
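
For context, the empty result follows from how any() handles a character vector: it coerces the values to logical, which yields NA, and comparing the column against a single NA is NA for every row. The sketch below reproduces this in base R. It rebuilds samp with data.frame() and calls as.logical() directly so that it runs without the warning; both choices are assumptions made for illustration.

```r
# Rebuild the sample data from the question (plain character columns).
samp <- data.frame(
  group = rep(c("a", "b", "c", "d"), times = 2),
  name  = c("pat", "joe", "steph", "jo", "scott", "tim", "mike", "hollis")
)
filt <- c("a", "b")

# any() expects logical input; "a" and "b" cannot be read as TRUE/FALSE,
# so the coercion produces NA for both elements.
as_lgl <- as.logical(filt)

# any() over nothing but NA values is NA.
collapsed <- any(as_lgl)

# Comparing the column with a single NA gives NA for every row,
# so no row satisfies the condition.
cmp <- samp$group == collapsed
n_kept <- sum(cmp, na.rm = TRUE)
```

dplyr's filter keeps only rows where the condition is TRUE, which is why it returns zero rows here.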

Thanks in advance for any help and suggestions!

【Question comments】:

Tags: r dataframe dplyr

【Solution 1】:

Try %in%:

samp[samp$group %in% c("a", "b"), ]
#   group  name
# 1     a   pat
# 2     b   joe
# 5     a scott
# 6     b   tim
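
As background on why this works where the match attempt in the question did not: base R documents %in% as match with the arguments the other way round, so the lookup runs once per row rather than once per filter value. A minimal sketch, assuming the sample data is rebuilt with data.frame() (the variable names first_hits, row_hits and keep are illustrative only):

```r
# Rebuild the sample data from the question (plain character columns).
samp <- data.frame(
  group = rep(c("a", "b", "c", "d"), times = 2),
  name  = c("pat", "joe", "steph", "jo", "scott", "tim", "mike", "hollis")
)
filt <- c("a", "b")

# Question's direction: one position per element of filt,
# i.e. only the first row where each value occurs.
first_hits <- match(filt, samp$group)

# Reversed direction: one entry per row of samp, 0 where the
# row's group is not found in filt.
row_hits <- match(samp$group, filt, nomatch = 0L)

# This logical vector is what samp$group %in% filt computes.
keep <- row_hits > 0L
res <- samp[keep, ]
```
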

The dplyr approach you are looking for is probably this:

library(dplyr)
filter(samp, group %in% c("a", "b"))
#   group  name
# 1     a   pat
# 2     b   joe
# 3     a scott
# 4     b   tim

This is similar to base R's subset(samp, subset = group %in% c("a", "b")), but note the warning in ?subset before you consider using it non-interactively.
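
For non-interactive use, one option is to avoid non-standard evaluation altogether and index with [ directly. The helper below is a hypothetical sketch, not part of base R or dplyr; the function name filter_rows_in and its arguments are assumptions made for illustration.

```r
# Hypothetical helper: keep the rows of df whose `column` value
# appears in `values`. It uses standard evaluation only, so the
# column name can be passed around as an ordinary string.
filter_rows_in <- function(df, column, values) {
  df[df[[column]] %in% values, , drop = FALSE]
}

# Rebuild the sample data from the question (plain character columns).
samp <- data.frame(
  group = rep(c("a", "b", "c", "d"), times = 2),
  name  = c("pat", "joe", "steph", "jo", "scott", "tim", "mike", "hollis")
)

filt <- c("a", "b")
res <- filter_rows_in(samp, "group", filt)
```

Because the filter values live in an ordinary vector, the same call scales to any number of values without extra syntax.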

【Comments】:

Haha, I was about to follow up to see whether there was a dplyr solution. Thanks for this, Ananda!