Unfilter data frame with conditional statement in R

【Question title】: Unfilter data frame with conditional statement in R
【Posted】: 2021-07-10 04:00:21
【Question】:

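I have two data frames, DF1 and DF2. I want to exclude the rows of DF1 that match DF2, so that the resulting data frame looks like DF3. The match should be conditional: if Room number in DF2 is "All Rooms", the row should be matched against DF1 on the columns Code, Description and Company; if Room number is not "All Rooms", it should be matched on Code, Description, Company and Room number.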

Code=c("A","B","C","E","D")
Desciption=c("Color is not Good","Odour is not good","Astetic Issue","Odour is not good","Lighting issue")
Company=c("Asian Paints","Burger","Asian Paints","Burger","Burger")
`Room number`=c("Room_1","Room_1","Room_2","Room_3","Room_2")
Rating=c("2","3","5","4","3")

DF1=data.frame(Code,Desciption,Company,`Room number`,Rating)

  Code        Desciption      Company Room.number Rating
1    A Color is not Good Asian Paints      Room_1      2
2    B Odour is not good       Burger      Room_1      3
3    C     Astetic Issue Asian Paints      Room_2      5
4    E Odour is not good       Burger      Room_3      4
5    D    Lighting issue       Burger      Room_2      3

Code=c("A","B")
Desciption=c("Color is not Good","Odour is not good")
Company=c("Asian Paints","Burger")
`Room number`=c("Room_1","All Rooms")

DF2=data.frame(Code,Desciption,Company,`Room number`)


> DF2
  Code        Desciption      Company Room.number
1    A Color is not Good Asian Paints      Room_1
2    B Odour is not good       Burger   All Rooms


Code=c("C","D")
Desciption=c("Astetic Issue","Lighting issue")
Company=c("Asian Paints","Burger")
`Room number`=c("Room_2","Room_2")
Rating=c("5","3")

DF3=data.frame(Code,Desciption,Company,`Room number`,Rating)

> DF3
  Code     Desciption      Company Room.number Rating
1    C  Astetic Issue Asian Paints      Room_2      5
2    D Lighting issue       Burger      Room_2      3

Can anyone help me solve this?

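【Comments】:

Spaces are not allowed in column names. Are you sure you want (or have in your file) Room number as a column name? If so, put it between backticks.
That was a mistake in the code I typed. I will edit the code.
Why was code E filtered out?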

Tags: r dataframe filter dplyr subset

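【Solution 1】:

You mentioned:

"Additionally I want to pass the condition as if my Room number is All Rooms then I will be able to match column Code, Description and Company from DF2 to DF1, ..."

It is not clear whether, in this specific case (All Rooms), you want to check Description and Company across all codes in DF1. If so, the syntax below will do.

Otherwise, if every combination has to be checked in DF1 across all columns in all possible combinations (i.e. Code, Description and Company), the filtered rows will be 0. Please recheck your logic and modify the question accordingly.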

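library(dplyr)
library(tidyr)  # provides unnest_longer()

DF1 %>% anti_join(DF2, by = c("Code", "Desciption", "Company", "Room.number")) %>%
  # Expand each "All Rooms" row to every code in DF1, then drop matches in any room
  anti_join(DF2 %>% filter(Room.number == "All Rooms") %>% 
              mutate(Code = list(unique(DF1$Code))) %>% 
              unnest_longer(Code) , 
            by = c("Code", "Desciption", "Company"))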

  Code     Desciption      Company Room.number Rating
1    C  Astetic Issue Asian Paints      Room_2      5
2    D Lighting issue       Burger      Room_2      3
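
Expanding Code to every code in DF1 and then joining on Code, Desciption and Company has the same effect as joining the "All Rooms" rows on Desciption and Company alone. The following is a minimal sketch of that shorter form, not the answerer's code: the names `exact_rules`, `wildcard_rules` and `result` are hypothetical, and the data is copied from the question (including its spelling "Desciption").

```r
library(dplyr)

# Sample data copied from the question.
DF1 <- data.frame(
  Code = c("A", "B", "C", "E", "D"),
  Desciption = c("Color is not Good", "Odour is not good", "Astetic Issue",
                 "Odour is not good", "Lighting issue"),
  Company = c("Asian Paints", "Burger", "Asian Paints", "Burger", "Burger"),
  Room.number = c("Room_1", "Room_1", "Room_2", "Room_3", "Room_2"),
  Rating = c("2", "3", "5", "4", "3")
)
DF2 <- data.frame(
  Code = c("A", "B"),
  Desciption = c("Color is not Good", "Odour is not good"),
  Company = c("Asian Paints", "Burger"),
  Room.number = c("Room_1", "All Rooms")
)

# Hypothetical names: split the exclusion rules by kind.
exact_rules <- filter(DF2, Room.number != "All Rooms")
wildcard_rules <- filter(DF2, Room.number == "All Rooms")

result <- DF1 %>%
  # Room-specific rules must match on all four columns.
  anti_join(exact_rules, by = c("Code", "Desciption", "Company", "Room.number")) %>%
  # "All Rooms" rules ignore Code and Room.number.
  anti_join(wildcard_rules, by = c("Desciption", "Company"))

print(result)
```

On the sample data this keeps the rows with codes C and D, the same rows as the output above.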

【Comments】:

Yes, I want to check Description and Company across all codes in DF1. Your solution works for me. Thank you very much.
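
For comparison, here is a minimal sketch of the literal reading of the rule in the question, where an "All Rooms" row must also match on Code. It is an illustration only; the names `exact_rules`, `wildcard_rules` and `literal` are hypothetical. Under this reading code E is kept, which is why the expected DF3 implies that Code is ignored for "All Rooms" rows.

```r
library(dplyr)

# Sample data copied from the question.
DF1 <- data.frame(
  Code = c("A", "B", "C", "E", "D"),
  Desciption = c("Color is not Good", "Odour is not good", "Astetic Issue",
                 "Odour is not good", "Lighting issue"),
  Company = c("Asian Paints", "Burger", "Asian Paints", "Burger", "Burger"),
  Room.number = c("Room_1", "Room_1", "Room_2", "Room_3", "Room_2"),
  Rating = c("2", "3", "5", "4", "3")
)
DF2 <- data.frame(
  Code = c("A", "B"),
  Desciption = c("Color is not Good", "Odour is not good"),
  Company = c("Asian Paints", "Burger"),
  Room.number = c("Room_1", "All Rooms")
)

exact_rules <- filter(DF2, Room.number != "All Rooms")
wildcard_rules <- filter(DF2, Room.number == "All Rooms")

literal <- DF1 %>%
  anti_join(exact_rules, by = c("Code", "Desciption", "Company", "Room.number")) %>%
  # Literal reading: "All Rooms" rows still have to match on Code.
  anti_join(wildcard_rules, by = c("Code", "Desciption", "Company"))

print(literal$Code)  # codes C, E and D remain
```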

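【Solution 2】:

Here is a vectorized base R way of filtering out rows that match several conditions. It creates logical indices and then subsets DF1 based on them. The only difference between DF3b and the expected result DF3 is in the row names, so I reset them to consecutive values.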

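# TRUE where the DF1 row itself is labelled "All Rooms"
i_all_rooms <- DF1[["Room.number"]] == "All Rooms"
# TRUE where the DF1 value does not occur anywhere in that DF2 column
# (each column is tested independently of the others)
i1 <- !DF1[["Code"]] %in% DF2[["Code"]]
i2 <- !DF1[["Desciption"]] %in% DF2[["Desciption"]]
i3 <- !DF1[["Company"]] %in% DF2[["Company"]]
i4 <- !DF1[["Room.number"]] %in% DF2[["Room.number"]]

# Rows to keep: with and without the room-number check
j1 <- i_all_rooms & i1 & (i2 | i3)
j2 <- !i_all_rooms & i1 & (i2 | i3) & i4

DF3b <- DF1[j1 | j2, ]
row.names(DF3b) <- NULL

identical(DF3, DF3b)
#[1] TRUE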
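
The indices above test each column on its own, so they do not check that the matched values come from the same DF2 row. A row-wise variant in base R can be sketched with composite keys. This is an illustration under assumptions, not the answerer's method: `make_key`, `hit_exact`, `hit_wild` and `DF3c` are hypothetical names, and the separator "\r" is an arbitrary choice assumed not to occur in the data.

```r
# Sample data copied from the question.
DF1 <- data.frame(
  Code = c("A", "B", "C", "E", "D"),
  Desciption = c("Color is not Good", "Odour is not good", "Astetic Issue",
                 "Odour is not good", "Lighting issue"),
  Company = c("Asian Paints", "Burger", "Asian Paints", "Burger", "Burger"),
  Room.number = c("Room_1", "Room_1", "Room_2", "Room_3", "Room_2"),
  Rating = c("2", "3", "5", "4", "3")
)
DF2 <- data.frame(
  Code = c("A", "B"),
  Desciption = c("Color is not Good", "Odour is not good"),
  Company = c("Asian Paints", "Burger"),
  Room.number = c("Room_1", "All Rooms")
)

# Hypothetical helper: paste the chosen columns into one key per row.
make_key <- function(df, cols) {
  do.call(paste, c(unname(as.list(df[cols])), sep = "\r"))
}

is_wildcard <- DF2[["Room.number"]] == "All Rooms"
full_cols <- c("Code", "Desciption", "Company", "Room.number")
wild_cols <- c("Desciption", "Company")

# Room-specific rules: the whole row must match.
hit_exact <- make_key(DF1, full_cols) %in% make_key(DF2[!is_wildcard, ], full_cols)
# "All Rooms" rules: only Desciption and Company must match.
hit_wild <- make_key(DF1, wild_cols) %in% make_key(DF2[is_wildcard, ], wild_cols)

DF3c <- DF1[!(hit_exact | hit_wild), ]
row.names(DF3c) <- NULL
print(DF3c)
```

On the sample data this keeps the rows with codes C and D.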

【Comments】: