【Posted】: 2021-02-05 10:54:00
【Question】:
Suppose I have the following data frame:
my_basket = data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"),
ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,NA),
Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,NA))
which produces:
> my_basket
   ITEM_GROUP ITEM_NAME Price Tax
1       Fruit     Apple   100   2
2       Fruit    Banana    80   4
3       Fruit    Orange    80   5
4       Fruit     Mango    90   6
5       Fruit    Papaya    65   2
6   Vegetable    Carrot    70   3
7   Vegetable    Potato    60   5
8   Vegetable   Brinjal    70   1
9   Vegetable   Raddish    25   3
10      Dairy      Milk    60   4
11      Dairy      Curd    40   5
12      Dairy    Cheese    35   6
13      Dairy      Milk    50   4
14      Dairy    Paneer    NA  NA
What I want to do now is list the fruits I want to keep and then filter on them, so:
fruitlist = c("Apple", "Banana")
How would I use the tidyverse to filter my data.frame so that it keeps only the fruits in my fruit list, plus all of my vegetables and dairy products? Normally I would do:
my_basket %<>% filter(ITEM_NAME %in% fruitlist)
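But then I also lose all the vegetables and dairy products, which is not what I want. I have been trying to do something with case_when, but can't seem to get it to work. There must be something obvious I'm missing here.
Edit: A few seconds after posting my question, I finally realized: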
my_basket %<>% filter(ITEM_NAME %in% fruitlist | ITEM_GROUP != "Fruit")
That solves it. I suppose that if I had to filter several groups like this, repeating the piped filter command would work too.
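For what it's worth, here is a minimal sketch of that "repeat the filter" idea for two groups. `dairylist` and `filtered` are hypothetical names made up for illustration; they are not part of the original question. It uses plain assignment with `%>%` rather than `%<>%`, so only dplyr needs to be loaded (`%<>%` comes from magrittr).

```r
library(dplyr)

# Same data as in the question
my_basket <- data.frame(
  ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"),
  ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
  Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,NA),
  Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,NA)
)

fruitlist <- c("Apple", "Banana")
dairylist <- c("Milk", "Curd")  # hypothetical second keep-list, for illustration only

# Each filter() restricts exactly one group: a row passes if it is on that
# group's keep-list OR if it belongs to a different group altogether.
filtered <- my_basket %>%
  filter(ITEM_NAME %in% fruitlist | ITEM_GROUP != "Fruit") %>%
  filter(ITEM_NAME %in% dairylist | ITEM_GROUP != "Dairy")

print(filtered)
```

Each step only touches its own group, so the order of the two `filter()` calls should not matter, and the vegetables pass through both untouched.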
【Comments】: