How do I filter for data that has partially matching pairs using filter and str_detect in R? Answers

【Question title】: How do I filter for data that has partially matching pairs using filter and str_detect in R?
【Posted】: 2021-08-20 20:29:24
【Question description】:

I am trying to filter for data that has matching groups; observations that have no matching group should be removed.

For example, if I have a dataset:

#  condition   group     type
#1   apple_1       B    small
#2   apple_1       A    small
#3   apple 1       A    small
#4   apple_2       A    small
#5   apple_2       A    small
#6   apple_3       A    small
#7    pear_1       A    small
#8    pear_1       A    small
#9    pear_1       A    small
#10   pear_2       A    small
#11   pear_3       A    small

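Here, I decided that each apple observation must be paired with a pear observation by its number (e.g., apple_3 should be paired with pear_3). Since there are two apple_2 observations but only one pear_2 observation, one of the apple_2 observations should be removed. Also, since the first apple_1 is in group B, it does not match any of the pears, so the group B apple_1 should be removed, and one pear_1 observation should be removed as well because it is left without a matching pair.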
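One way to make the pairing rule concrete is to count, for every group and number, how many apples and how many pears there are; the number of pairs that can be kept is the smaller of the two counts. The base R sketch below assumes row 3 of the table was meant to be apple_1, and the helper columns `fruit` and `number` are made up for illustration.

```r
# Data from the question; row 3 is assumed to be "apple_1" (see the comments).
df <- data.frame(
  condition = c("apple_1", "apple_1", "apple_1", "apple_2", "apple_2", "apple_3",
                "pear_1", "pear_1", "pear_1", "pear_2", "pear_3"),
  group = c("B", rep("A", 10)),
  type = "small",
  stringsAsFactors = FALSE
)

# Split "apple_1" into a fruit part and a number part (hypothetical helper columns).
df$fruit  <- sub("_.*$", "", df$condition)
df$number <- sub("^.*_", "", df$condition)

# Count observations per group, number and fruit.
counts <- table(group = df$group, number = df$number, fruit = df$fruit)

# For each group/number, the number of complete pairs is the smaller count.
pairs <- pmin(counts[, , "apple"], counts[, , "pear"])
print(pairs)
```

For group A this gives 2, 1 and 1 pairs for numbers 1, 2 and 3, and none for group B, i.e. 4 apples and 4 pears should survive the filter.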

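The problem here is that the observations are named using underscores, so I need to manipulate str_detect somehow, and the groups need to match, so I need to use filter as well. I have a feeling this type of filtering can be done with dplyr, but I am not sure.

The expected result I am looking for is: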

#  condition   group     type
#1   apple_1       A    small
#2   apple_1       A    small
#3   apple_2       A    small
#4   apple_3       A    small
#5    pear_1       A    small
#6    pear_1       A    small
#7    pear_2       A    small
#8    pear_3       A    small

This way, every apple with a given number has a matching pear with the same number.

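【Comments】:

Why not use strsplit to create a new variable holding the suffix 1, 2, 3 that follows the underscore in condition?
Hi Eric, welcome to SO. Could you give an example of what your expected result is? Also, I think you missed an underscore in row 3 (I think it should be apple_1).
OK, I'll try strsplit and see where it goes. Thanks.
In the tidyverse you can also find tidyr::separate for this purpose.
Great, I'll explore the tidyverse as well.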
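The strsplit suggestion from the comments could look roughly like this. It is a minimal base R sketch on a made-up `condition` vector; the names `fruit` and `suffix` are only for illustration.

```r
condition <- c("apple_1", "apple_2", "pear_1", "pear_3")

# strsplit() returns a list with one character vector per input string.
parts <- strsplit(condition, "_", fixed = TRUE)

# Take the first piece as the fruit and the second as the numeric suffix.
fruit  <- vapply(parts, `[`, character(1), 1)
suffix <- vapply(parts, `[`, character(1), 2)

data.frame(condition, fruit, suffix)
```

With tidyr, `separate(df, condition, into = c("fruit", "suffix"), sep = "_", remove = FALSE)` should produce the same two columns while keeping `condition`.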

Tags: r filter dplyr stringr

【Solution 1】:

You could do it like this:

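# Return a logical keep-mask for x so that both values in x end up equally frequent.
vec_drop <- function(x){
  b <- table(x)                         # counts per distinct value
  if(length(b)<2) return(FALSE)         # only one kind present: no pair, drop all
  a <- split(!logical(length(x)), x)    # start with TRUE for every element
  if (length(unique(b))>1)              # counts differ: drop the surplus
    a[[names(which.max(b))]][seq(abs(diff(b)))] <- FALSE
  unsplit(a, x)                         # back to the original order
}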


library(dplyr)
library(stringr)

df %>%
  group_by(group, cond = str_remove(condition, "\\w+_")) %>%  # cond = number after the underscore
  filter(vec_drop(condition))


  condition group type  cond 
  <chr>     <chr> <chr> <chr>
1 apple_1   A     small 1    
2 apple_1   A     small 1    
3 apple_2   A     small 2    
4 apple_3   A     small 3    
5 pear_1    A     small 1    
6 pear_1    A     small 1    
7 pear_2    A     small 2    
8 pear_3    A     small 3    
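To see what vec_drop contributes on its own, it can help to call it on the condition values of a single (group, cond) combination. The sketch below copies the function from the answer and feeds it hand-written vectors; the variable names `x` and `keep` are just for illustration.

```r
# vec_drop() as given in the answer above.
vec_drop <- function(x){
  b <- table(x)
  if(length(b)<2) return(FALSE)
  a <- split(!logical(length(x)), x)
  if (length(unique(b))>1)
    a[[names(which.max(b))]][seq(abs(diff(b)))] <- FALSE
  unsplit(a, x)
}

# Group A, number 1 from the question: 2 apples and 3 pears.
x <- c("apple_1", "apple_1", "pear_1", "pear_1", "pear_1")
keep <- vec_drop(x)   # the first pear_1 is marked FALSE
x[keep]

# A group with only one kind of fruit has nothing to pair with,
# so a single FALSE is returned and filter() drops the whole group.
vec_drop(c("apple_1", "apple_1"))
```

Note that the function appears to assume at most two distinct condition values per group; with three or more kinds sharing a number it would probably need to be rewritten.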

【Comments】: