R - join data frames and filter whole groups

【Question title】: R - join data frames and filter whole groups
【Posted】: 2020-12-06 04:53:18
【Question】:

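I got three data frames containing different recipes from a website. The first is pancakes, the second is French toast, and the third is eggs Benedict. I then combined the three tables into one, which I call recipe_list.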

library(dplyr)  # provides %>% and mutate()

# pancakes
# Good Old Fashioned Pancakes
ingredients <- c("flour", "baking powder", "salt", "white sugar", "milk", "egg(s)", "butter")
amount <- c(1.5, 3.5, 1, 1, 1.25, 1, 3)
measure <- c("cup(s)", "teaspoon(s)", "teaspoon(s)", "tablespoon(s)", "cup(s)", "", "tablespoon(s)") 
pancake_data <- data.frame(ingredients, amount, measure)
pancake_data <- pancake_data %>%
  mutate(recipe = "pancakes")

# french toast
# Vanilla-Almond Spiced French Toast
ingredients <- c("milk", "sugar", "egg(s)", "vanilla extract", "cinnamon", "nutmeg", "allspice", "toast")
amount <- c(2, 2, 4, 1, 0.5, 0.25, 0.125, 8)
measure <- c("cup(s)", "tablespoon(s)", "", "teaspoon(s)", "teaspoon(s)", "teaspoon(s)", "teaspoon(s)", "slice(s)") 
french_toast_data <- data.frame(ingredients, amount, measure)
french_toast_data <- french_toast_data %>%
  mutate(recipe = "vanilla-almond spiced french toast")

# eggs benedict
ingredients <- c("egg yolk(s)", "lemon juice", "pepper", "Worcestershire sauce", "water", "butter", "salt", "eggs", "white vinegar", "Canadian-style bacon", "English muffins", "butter")
amount <- c(4, 3.5, 1, 0.125, 1, 1, 0.25, 8, 1, 8, 4, 2)
measure <- c("", "tablespoon(s)", "pinch", "teaspoon(s)", "tablespoon(s)", "cup", "teaspoon(s)", "", "teaspoon(s)", "strip(s)", "", "tablespoon(s)") 
eggs_benedict_data <- data.frame(ingredients, amount, measure)
eggs_benedict_data <- eggs_benedict_data %>%
  mutate(recipe = "eggs benedict")

recipe_list <- rbind(pancake_data, french_toast_data, eggs_benedict_data)

Now suppose I take inventory of what is in my fridge and come up with this table:

current_fridge <- c("flour", "baking powder", "salt", "white sugar", "milk", "egg(s)", "butter", "milk", "sugar", "egg(s)", "vanilla extract", "cinnamon", "nutmeg", "toast")
amount <- c(1.5, 3.5, 1, 1, 1.25, 1, 3, 2, 2, 4, 1, 0.5, 0.25, 8)
measure <- c("cup(s)", "teaspoon(s)", "teaspoon(s)", "tablespoon(s)", "cup(s)", "", "tablespoon(s)","cup(s)", "tablespoon(s)", "", "teaspoon(s)", "teaspoon(s)", "teaspoon(s)", "slice(s)") 
current_fridge_data <- data.frame(current_fridge, amount, measure)

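I know I can use a semi join or something similar to filter recipe_list by what is in current_fridge_data. But how can I do this so that I only include recipes for which all ingredients are available, with none missing? I am trying to create a new data frame that I could call "possible recipes given the ingredients". Is there a flexible answer that would still work if I added eggs Florentine or something else?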

【Comments】:

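Hi @RonakShah, sorry for the late reply. I took some time to rethink this question so that it makes more sense, and I kept going back and forth on it. I think recipes make a good placeholder here, so I rewrote the question. I hope it makes more sense now.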

Tags: r join filter dplyr

【Solution 1】:

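For each recipe, you can check whether every required ingredient is present in the fridge and whether the amount in the fridge is at least the amount the recipe requires.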

library(dplyr)

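# Attach the fridge amount (amount.y) to each recipe ingredient (amount.x);
# ingredients that are not in the fridge get NA.
recipe_list %>%
  left_join(current_fridge_data, by = c('ingredients' = 'current_fridge')) %>%
  group_by(recipe) %>%
  # A recipe qualifies only if every ingredient is present in a sufficient amount
  summarise(all_ingredient_present = all(amount.x <= amount.y & !is.na(amount.y)))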

# recipe                             all_ingredient_present
#  <chr>                              <lgl>                 
#1 eggs benedict                      FALSE                 
#2 pancakes                           TRUE                  
#3 vanilla-almond spiced french toast FALSE
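
The question asks for a data frame of the possible recipes rather than a TRUE/FALSE summary. One way to get there is to keep the same join but replace `summarise()` with a grouped `filter()`, which drops a whole recipe as soon as one ingredient fails the check. The sketch below is not part of the original answer. It uses small stand-in tables instead of the full data, and it makes one assumption of its own: duplicate fridge rows (such as the two milk entries) are summed into a hypothetical `fridge_totals` table first, which only makes sense when the duplicates share the same unit.

```r
library(dplyr)

# Stand-in data (hypothetical, much smaller than the tables in the question)
recipe_list <- data.frame(
  ingredients = c("flour", "milk", "milk", "allspice"),
  amount      = c(1.5, 1.25, 2, 0.125),
  recipe      = c("pancakes", "pancakes", "french toast", "french toast")
)
current_fridge_data <- data.frame(
  current_fridge = c("flour", "milk", "milk"),
  amount         = c(1.5, 1.25, 2)
)

# Assumption: duplicate fridge rows use the same unit, so they can be summed
fridge_totals <- current_fridge_data %>%
  group_by(current_fridge) %>%
  summarise(in_fridge = sum(amount), .groups = "drop")

# Keep every row of a recipe only if all of its ingredients are covered
possible_recipes <- recipe_list %>%
  left_join(fridge_totals, by = c("ingredients" = "current_fridge")) %>%
  group_by(recipe) %>%
  filter(all(!is.na(in_fridge) & amount <= in_fridge)) %>%
  ungroup()

print(possible_recipes)
```

Because the check runs per group, adding another recipe such as eggs Florentine to recipe_list requires no change to this code.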

【Comments】:

【Solution 2】:

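I suggest using an if statement to decide whether french_toast_data should be appended. Check whether every unique ingredient in french_toast_data is found in current_fridge_data. If not, do not add french_toast_data to what_can_I_make at all. If the comparison returns all TRUE, the sum equals the number of unique ingredients in the recipe.


# Start with an empty data frame and append each recipe that can be made
what_can_I_make <- data.frame()

needed <- unique(french_toast_data$ingredients)
in_fridge <- unique(current_fridge_data$current_fridge)

# TRUE counts as 1, so the sum equals length(needed) only if nothing is missing
if (sum(needed %in% in_fridge) == length(needed)) {
  what_can_I_make <- rbind(what_can_I_make, french_toast_data)
}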
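
To avoid writing one if statement per recipe, the same presence check can be applied to every recipe in a combined table. The following is a rough base R sketch, not the answerer's code. It uses hypothetical stand-in data and a hypothetical `fridge_items` vector, and it only tests whether each ingredient is present, ignoring amounts.

```r
# Stand-in data (hypothetical, much smaller than the tables in the question)
recipe_list <- data.frame(
  ingredients = c("flour", "milk", "milk", "allspice"),
  recipe      = c("pancakes", "pancakes", "french toast", "french toast")
)
fridge_items <- c("flour", "milk", "butter")

# One data frame per recipe
by_recipe <- split(recipe_list, recipe_list$recipe)

# Keep only the recipes whose ingredients are all in the fridge
makeable <- Filter(function(r) all(r$ingredients %in% fridge_items), by_recipe)

# Stack the surviving recipes back into a single data frame
what_can_I_make <- do.call(rbind, makeable)

print(what_can_I_make)
```

Note that `do.call(rbind, makeable)` returns NULL when no recipe qualifies, so downstream code may need to handle that case.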

【Comments】:

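Hi @andrea, thanks for the suggestion. It looks like a really smart solution, but I'm not sure it is flexible enough. I spent a day coming up with a better example and then rewrote the question. The new example has an additional table.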
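I accepted the suggested renaming in the solution so that it matches the new example. Judging by the example as it stands now, Ronak's approach falls well short.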