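【Question title】: Unfilter data frame with conditional statement in R
【Posted】: 2021-07-10 04:00:21
【Question description】:

I have two different data frames, DF1 and DF2. I want to exclude the rows of DF1 that match DF2, so that the resulting data frame looks like DF3. In addition, I want to apply a condition: if Room number is "All Rooms", the match between DF2 and DF1 should use the columns Code, Description and Company; if Room number is not "All Rooms", it should use the columns Code, Description, Company and Room number.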

Code=c("A","B","C","E","D")
Desciption=c("Color is not Good","Odour is not good","Astetic Issue","Odour is not good","Lighting issue")
Company=c("Asian Paints","Burger","Asian Paints","Burger","Burger")
`Room number`=c("Room_1","Room_1","Room_2","Room_3","Room_2")
Rating=c("2","3","5","4","3")

DF1=data.frame(Code,Desciption,Company,`Room number`,Rating)

  Code        Desciption      Company Room.number Rating
1    A Color is not Good Asian Paints      Room_1      2
2    B Odour is not good       Burger      Room_1      3
3    C     Astetic Issue Asian Paints      Room_2      5
4    E Odour is not good       Burger      Room_3      4
5    D    Lighting issue       Burger      Room_2      3

Code=c("A","B")
Desciption=c("Color is not Good","Odour is not good")
Company=c("Asian Paints","Burger")
`Room number`=c("Room_1","All Rooms")

DF2=data.frame(Code,Desciption,Company,`Room number`)


> DF2
  Code        Desciption      Company Room.number
1    A Color is not Good Asian Paints      Room_1
2    B Odour is not good       Burger   All Rooms


Code=c("C","D")
Desciption=c("Astetic Issue","Lighting issue")
Company=c("Asian Paints","Burger")
`Room number`=c("Room_2","Room_2")
Rating=c("5","3")

DF3=data.frame(Code,Desciption,Company,`Room number`,Rating)

> DF3
  Code     Desciption      Company Room.number Rating
1    C  Astetic Issue Asian Paints      Room_2      5
2    D Lighting issue       Burger      Room_2      3

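Can anyone help me solve this?

【Discussion】:

  • Spaces are not allowed in column names. Are you sure you want Room number as a column name (or that it appears that way in the file)? If so, wrap it in backticks.
  • That was a mistake when I typed the code. I will edit the code.
  • Why was code E filtered out?

Tags: r dataframe filter dplyr subset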


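【Solution 1】:

You mentioned:

In addition, I want to apply a condition: if Room number is All Rooms, then match the columns Code, Description and Company from DF2 to DF1, ...

It is not clear whether, in this specific case (All Rooms), you want to check description & company against all codes in DF1. If so, the syntax below will do it.

Otherwise, if every combination of all three columns (i.e. code, description, company) has to be checked in DF1, the filtered result will have 0 rows. Please recheck your logic and revise the question accordingly.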

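library(dplyr)
library(tidyr)  # for unnest_longer()

# Step 1: drop rows that match DF2 on all four columns.
# Step 2: for "All Rooms" rules, expand Code to every code in DF1, then drop matches.
DF1 %>% anti_join(DF2, by = c("Code", "Desciption", "Company", "Room.number")) %>%
  anti_join(DF2 %>% filter(Room.number == "All Rooms") %>% 
              mutate(Code = list(unique(DF1$Code))) %>% 
              unnest_longer(Code) , 
            by = c("Code", "Desciption", "Company"))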

  Code     Desciption      Company Room.number Rating
1    C  Astetic Issue Asian Paints      Room_2      5
2    D Lighting issue       Burger      Room_2      3
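
Because the "All Rooms" rule is expanded to every code that occurs in DF1, matching on Code, Desciption and Company in the second step comes down to matching on Desciption and Company alone. A minimal sketch of that reading, assuming this is the intended rule; the names specific_rules, all_room_rules and result are made up for illustration, and the data frames are rebuilt here only so the block runs on its own:

```r
library(dplyr)

# Sample data from the question (column name "Desciption" kept as posted).
DF1 <- data.frame(
  Code = c("A", "B", "C", "E", "D"),
  Desciption = c("Color is not Good", "Odour is not good", "Astetic Issue",
                 "Odour is not good", "Lighting issue"),
  Company = c("Asian Paints", "Burger", "Asian Paints", "Burger", "Burger"),
  Room.number = c("Room_1", "Room_1", "Room_2", "Room_3", "Room_2"),
  Rating = c("2", "3", "5", "4", "3")
)
DF2 <- data.frame(
  Code = c("A", "B"),
  Desciption = c("Color is not Good", "Odour is not good"),
  Company = c("Asian Paints", "Burger"),
  Room.number = c("Room_1", "All Rooms")
)

# Split the exclusion rules by whether they name a specific room.
specific_rules <- filter(DF2, Room.number != "All Rooms")
all_room_rules <- filter(DF2, Room.number == "All Rooms")

result <- DF1 %>%
  # Room-specific rules must match on all four columns.
  anti_join(specific_rules,
            by = c("Code", "Desciption", "Company", "Room.number")) %>%
  # "All Rooms" rules match on description and company, whatever the code.
  anti_join(all_room_rules, by = c("Desciption", "Company"))

print(result)
```

On the sample data this leaves the rows with Code C and D, the same rows as the output above. If the "All Rooms" rule should also require the same Code, add "Code" to the second by; row E would then be kept.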

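【Discussion】:

  • Yes, I want to check description and company against all codes in DF1. Your solution worked for me. Thank you very much.
【Solution 2】:

Here is a base R vectorized way of filtering out rows that match several conditions. It creates logical indices and then subsets DF1 based on them. The only difference between DF3b and the expected result DF3 is the row names, so I set them to consecutive values.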

i_all_rooms <- DF1[["Room.number"]] == "All Rooms"
i1 <- !DF1[["Code"]] %in% DF2[["Code"]]
i2 <- !DF1[["Desciption"]] %in% DF2[["Desciption"]]
i3 <- !DF1[["Company"]] %in% DF2[["Company"]]
i4 <- !DF1[["Room.number"]] %in% DF2[["Room.number"]]

j1 <- i_all_rooms & i1 & (i2 | i3)
j2 <- !i_all_rooms & i1 & (i2 | i3) & i4

DF3b <- DF1[j1 | j2, ]
row.names(DF3b) <- NULL

identical(DF3, DF3b)
#[1] TRUE
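
Note that %in% in the indices above tests each column on its own, not combinations of values within a row. A sketch of a row-wise variant in base R, assuming the "All Rooms" rule means a match on Desciption and Company only (the reading confirmed in the discussion of Solution 1); the helper row_key and the names specific, wildcard and DF3c are hypothetical, and the data frames are rebuilt here so the block runs on its own:

```r
# Sample data from the question (column name "Desciption" kept as posted).
DF1 <- data.frame(
  Code = c("A", "B", "C", "E", "D"),
  Desciption = c("Color is not Good", "Odour is not good", "Astetic Issue",
                 "Odour is not good", "Lighting issue"),
  Company = c("Asian Paints", "Burger", "Asian Paints", "Burger", "Burger"),
  Room.number = c("Room_1", "Room_1", "Room_2", "Room_3", "Room_2"),
  Rating = c("2", "3", "5", "4", "3")
)
DF2 <- data.frame(
  Code = c("A", "B"),
  Desciption = c("Color is not Good", "Odour is not good"),
  Company = c("Asian Paints", "Burger"),
  Room.number = c("Room_1", "All Rooms")
)

# Hypothetical helper: one string per row built from the chosen columns,
# joined with a separator unlikely to occur in the data.
row_key <- function(df, cols) do.call(paste, c(df[cols], sep = "\037"))

all_cols  <- c("Code", "Desciption", "Company", "Room.number")
wild_cols <- c("Desciption", "Company")

specific <- DF2[DF2$Room.number != "All Rooms", ]
wildcard <- DF2[DF2$Room.number == "All Rooms", ]

# TRUE where a DF1 row matches a whole rule row, not just single columns.
drop_specific <- row_key(DF1, all_cols) %in% row_key(specific, all_cols)
drop_wildcard <- row_key(DF1, wild_cols) %in% row_key(wildcard, wild_cols)

DF3c <- DF1[!(drop_specific | drop_wildcard), ]
row.names(DF3c) <- NULL
print(DF3c)
```

On the sample data this keeps the rows with Code C and D. Comparing whole-row keys avoids treating a row as a match when its values appear in DF2 only in different rows.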

【Discussion】:
