Conditional subsetting of all rows of a cluster in a data frame: answers

【Question title】: Conditional subsetting of all rows of a cluster in a dataframe
【Posted】: 2021-10-03 15:09:18
【Question】:

In the data.frame below, I would like to know how to conditionally subset all rows of each study for which group is constant but outcome varies.

h = "
study outcome group
a     1       1
a     2       1
b     1       1
b     1       2
c     2       1
c     3       2
d     1       1
d     1       1
e     1       1"
h = read.table(text = h, header = TRUE)

#DESIRED OUTPUT:
  
#study outcome group 
#a     1       1
#a     2       1

【Question comments】:

Tags: r dataframe subset

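【Solution 1】:

After grouping by 'study', we can use n_distinct inside filter to build the condition: keep a study only if it has exactly one distinct 'group' value and more than one distinct 'outcome' value.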

library(dplyr)
h %>% 
    group_by(study) %>%
    filter(n_distinct(group) == 1, n_distinct(outcome) > 1)
# A tibble: 2 x 3
# Groups:   study [1]
  study outcome group
  <chr>   <int> <int>
1 a           1     1
2 a           2     1
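
To see why only study a passes the filter, it can help to look at the per-study distinct counts directly. The following is a minimal sketch, assuming dplyr is installed; the names counts, n_group and n_outcome are introduced here for illustration and are not part of the original answer.

```r
library(dplyr)

h <- read.table(text = "
study outcome group
a     1       1
a     2       1
b     1       1
b     1       2
c     2       1
c     3       2
d     1       1
d     1       1
e     1       1", header = TRUE)

# One row per study, holding the number of distinct values in each column.
# (`counts`, `n_group`, `n_outcome` are illustrative names.)
counts <- h %>%
  group_by(study) %>%
  summarise(n_group = n_distinct(group),
            n_outcome = n_distinct(outcome))

# Studies whose group is constant while outcome varies.
counts %>% filter(n_group == 1, n_outcome > 1)
```

The filter call in the answer evaluates the same two counts per study, but keeps the original rows instead of collapsing each study to one summary row.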

Or using base R

subset(h, ave(group, study, FUN = function(x) length(unique(x))) == 1 &
          ave(outcome, study, FUN = function(x) length(unique(x))) > 1)
  study outcome group
1     a       1     1
2     a       2     1
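
The base R version relies on ave() returning a vector as long as its input, with each study's count repeated on every row of that study. A small sketch of the intermediate values, using only base R (the helper n_unique and the vectors n_group, n_outcome and keep are names chosen here for illustration):

```r
h <- read.table(text = "
study outcome group
a     1       1
a     2       1
b     1       1
b     1       2
c     2       1
c     3       2
d     1       1
d     1       1
e     1       1", header = TRUE)

# Illustrative helper: number of distinct values in a vector.
n_unique <- function(x) length(unique(x))

# ave() applies the function per study and recycles the result to every row.
n_group   <- ave(h$group,   h$study, FUN = n_unique)
n_outcome <- ave(h$outcome, h$study, FUN = n_unique)

# Show the counts next to the data to check them row by row.
cbind(h, n_group, n_outcome)

# Row-wise logical mask: group constant and outcome varying within the study.
keep <- n_group == 1 & n_outcome > 1
h[keep, ]
```

Because the mask has one element per row, it can be passed to subset() or used for direct indexing, as above.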

If needed, we can generalize this into a function that covers all four combinations

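# Return all rows of the studies that match one of four patterns:
#   1: group constant, outcome varies   2: group varies, outcome constant
#   3: both vary                        4: both constant
# A numeric `cond` selects the alternative by position; a character `cond`
# ("1" to "4") selects it by name.
f1 <- function(dat, cond) {

  switch(cond,
    `1` = dat %>%
      group_by(study) %>%
      filter(n_distinct(group) == 1, n_distinct(outcome) > 1) %>%
      ungroup(),
    `2` = dat %>%
      group_by(study) %>%
      filter(n_distinct(group) > 1, n_distinct(outcome) == 1) %>%
      ungroup(),
    `3` = dat %>%
      group_by(study) %>%
      filter(n_distinct(group) > 1, n_distinct(outcome) > 1) %>%
      ungroup(),
    `4` = dat %>%
      group_by(study) %>%
      filter(n_distinct(group) == 1, n_distinct(outcome) == 1) %>%
      ungroup()
  )

}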

Testing

> f1(h, 1)
# A tibble: 2 x 3
  study outcome group
  <chr>   <int> <int>
1 a           1     1
2 a           2     1
> f1(h, 2)
# A tibble: 2 x 3
  study outcome group
  <chr>   <int> <int>
1 b           1     1
2 b           1     2
> f1(h, 3)
# A tibble: 2 x 3
  study outcome group
  <chr>   <int> <int>
1 c           2     1
2 c           3     2
> f1(h, 4)
# A tibble: 3 x 3
  study outcome group
  <chr>   <int> <int>
1 d           1     1
2 d           1     1
3 e           1     1
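
As a possible variation, the four cases can also be expressed with two logical flags instead of a numeric code. This is only a sketch in base R, not the answer author's code: the function f2 and its arguments group_varies and outcome_varies are hypothetical names introduced here.

```r
h <- read.table(text = "
study outcome group
a     1       1
a     2       1
b     1       1
b     1       2
c     2       1
c     3       2
d     1       1
d     1       1
e     1       1", header = TRUE)

# Hypothetical helper (not from the original answer): keep the rows of every
# study whose `group` and `outcome` columns vary (TRUE) or stay constant
# (FALSE), as requested by the two flags.
f2 <- function(dat, group_varies, outcome_varies) {
  n_unique <- function(x) length(unique(x))
  g_varies <- ave(dat$group,   dat$study, FUN = n_unique) > 1
  o_varies <- ave(dat$outcome, dat$study, FUN = n_unique) > 1
  dat[g_varies == group_varies & o_varies == outcome_varies, , drop = FALSE]
}

f2(h, group_varies = FALSE, outcome_varies = TRUE)   # same rows as f1(h, 1)
f2(h, group_varies = TRUE,  outcome_varies = FALSE)  # same rows as f1(h, 2)
```

Unlike f1, this version returns a plain data.frame with the original row names rather than a tibble.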

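【Comments】:

@rnorouzian After looking at your previous post, this is simpler than I thought. I will post the full code here.
Great, I have an interesting follow-up on your function.
Sure, happy to help.
HERE is my interesting follow-up.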