【Question title】: R - join data frames and filter whole groups
【Posted】: 2020-12-06 04:53:18
【Question】:

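I got three data frames containing different recipes from a website. The first is pancakes, the second is French toast, and the third is eggs Benedict. I then combined the three tables into one, which I call recipe_list.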

library(dplyr)  # needed for %>% and mutate()

# pancakes
# Good Old Fashioned Pancakes
ingredients <- c("flour", "baking powder", "salt", "white sugar", "milk", "egg(s)", "butter")
amount <- c(1.5, 3.5, 1, 1, 1.25, 1, 3)
measure <- c("cup(s)", "teaspoon(s)", "teaspoon(s)", "tablespoon(s)", "cup(s)", "", "tablespoon(s)") 
pancake_data <- data.frame(ingredients, amount, measure)
pancake_data <- pancake_data %>%
  mutate(recipe = "pancakes")

# french toast
# Vanilla-Almond Spiced French Toast
ingredients <- c("milk", "sugar", "egg(s)", "vanilla extract", "cinnamon", "nutmeg", "allspice", "toast")
amount <- c(2, 2, 4, 1, 0.5, 0.25, 0.125, 8)
measure <- c("cup(s)", "tablespoon(s)", "", "teaspoon(s)", "teaspoon(s)", "teaspoon(s)", "teaspoon(s)", "slice(s)") 
french_toast_data <- data.frame(ingredients, amount, measure)
french_toast_data <- french_toast_data %>%
  mutate(recipe = "vanilla-almond spiced french toast")

# eggs benedict
ingredients <- c("egg yolk(s)", "lemon juice", "pepper", "Worcestershire sauce", "water", "butter", "salt", "eggs", "white vinegar", "Canadian-style bacon", "English muffins", "butter")
amount <- c(4, 3.5, 1, 0.125, 1, 1, 0.25, 8, 1, 8, 4, 2)
measure <- c("", "tablespoon(s)", "pinch", "teaspoon(s)", "tablespoon(s)", "cup", "teaspoon(s)", "", "teaspoon(s)", "strip(s)", "", "tablespoon(s)") 
eggs_benedict_data <- data.frame(ingredients, amount, measure)
eggs_benedict_data <- eggs_benedict_data %>%
  mutate(recipe = "eggs benedict")

recipe_list <- rbind(pancake_data, french_toast_data, eggs_benedict_data)

Now suppose I take inventory of what is in my fridge and come up with this table:

current_fridge <- c("flour", "baking powder", "salt", "white sugar", "milk", "egg(s)", "butter", "milk", "sugar", "egg(s)", "vanilla extract", "cinnamon", "nutmeg", "toast")
amount <- c(1.5, 3.5, 1, 1, 1.25, 1, 3, 2, 2, 4, 1, 0.5, 0.25, 8)
measure <- c("cup(s)", "teaspoon(s)", "teaspoon(s)", "tablespoon(s)", "cup(s)", "", "tablespoon(s)","cup(s)", "tablespoon(s)", "", "teaspoon(s)", "teaspoon(s)", "teaspoon(s)", "slice(s)") 
current_fridge_data <- data.frame(current_fridge, amount, measure)

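I know I can use a semi join or something similar to filter recipe_list by what is in current_fridge_data. But how can I do it so that I only keep recipes for which every ingredient is available (none missing)? I'm trying to create a new data frame that I could call something like: possible recipes given the ingredients. Is there an answer flexible enough that it still works if I add eggs Florentine or some other recipe?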

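【Comments】:

  • Hi @RonakShah, sorry for the late reply. I spent some time rethinking the problem so that it makes more sense, and I've been going back and forth on it. I think recipes work well as a placeholder here, so I rewrote the question. I hope it makes more sense now.

Tags: r join filter dplyr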


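【Solution 1】:

For each recipe, you can check whether every required ingredient is in the fridge and whether the amount in the fridge is at least the amount the recipe needs.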

library(dplyr)

recipe_list %>%
  left_join(current_fridge_data, by = c('ingredients' = 'current_fridge')) %>%
  group_by(recipe) %>%
  summarise(all_ingredient_present = all(amount.x <= amount.y & !is.na(amount.y)))

# recipe                             all_ingredient_present
#  <chr>                              <lgl>                 
#1 eggs benedict                      FALSE                 
#2 pancakes                           TRUE                  
#3 vanilla-almond spiced french toast FALSE       
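
To get the full rows of the recipes that can be made, rather than one TRUE/FALSE per recipe, the same test can be moved into a grouped filter(). The following is a minimal sketch and not part of the original answer. It uses a cut-down copy of the question's data, and the names fridge_totals, in_fridge and possible_recipes are made up for illustration. It also assumes that duplicate fridge rows for the same ingredient are separate stock that can be added together, and that amounts are only comparable when the measure matches.

```r
library(dplyr)

# Cut-down copy of the question's data (assumption: same column names)
recipe_list <- data.frame(
  ingredients = c("flour", "milk", "egg(s)", "milk", "egg(s)", "allspice"),
  amount      = c(1.5, 1.25, 1, 2, 4, 0.125),
  measure     = c("cup(s)", "cup(s)", "", "cup(s)", "", "teaspoon(s)"),
  recipe      = c(rep("pancakes", 3), rep("french toast", 3))
)

current_fridge_data <- data.frame(
  current_fridge = c("flour", "milk", "egg(s)", "milk", "egg(s)"),
  amount         = c(1.5, 1.25, 1, 2, 4),
  measure        = c("cup(s)", "cup(s)", "", "cup(s)", "")
)

# Assumption: duplicate fridge rows are summed per ingredient and measure
fridge_totals <- current_fridge_data %>%
  group_by(ingredients = current_fridge, measure) %>%
  summarise(in_fridge = sum(amount), .groups = "drop")

# Keep all rows of a recipe only if every one of its ingredients is covered
possible_recipes <- recipe_list %>%
  left_join(fridge_totals, by = c("ingredients", "measure")) %>%
  group_by(recipe) %>%
  filter(all(!is.na(in_fridge) & amount <= in_fridge)) %>%
  ungroup() %>%
  select(-in_fridge)
```

Because the filter works per group, a recipe added to recipe_list later is checked without any change to the code. Each recipe is tested on its own against the whole fridge, so the result does not say whether several recipes could be made at the same time.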

【Comments】:

    【Solution 2】:

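    I'd suggest using an if statement to decide whether french_toast_data should be appended. Check whether every unique ingredient in french_toast_data is found in current_fridge_data. If not, don't put french_toast_data into what_can_I_make at all. If every comparison returns TRUE, the sum of the matches equals the number of unique ingredients in the recipe.

    
    # Start from an empty data frame so rbind() has something to append to
    what_can_I_make <- data.frame()
    
    # Unique ingredients the recipe needs
    needed <- unique(french_toast_data$ingredients)
    
    if (sum(needed %in% current_fridge_data$current_fridge) == length(needed)) {
      what_can_I_make <- rbind(what_can_I_make, french_toast_data)
    }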
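
The check above handles one recipe at a time. As a rough sketch of how it could be applied to every recipe without naming each one, assuming the recipes are stacked in one data frame with a recipe column as in the question, an anti_join() can list the missing ingredients, and any recipe with a missing ingredient can then be dropped. The toy data (including the "eggs florentine" rows) and the name missing_items are illustrative only, and amounts are ignored here.

```r
library(dplyr)

# Toy data using the question's column names (assumption)
recipe_list <- data.frame(
  ingredients = c("flour", "milk", "milk", "allspice", "egg(s)", "spinach"),
  recipe      = c("pancakes", "pancakes", "french toast", "french toast",
                  "eggs florentine", "eggs florentine")
)
current_fridge_data <- data.frame(
  current_fridge = c("flour", "milk", "egg(s)", "spinach")
)

# Recipe rows whose ingredient is not in the fridge at all
missing_items <- recipe_list %>%
  anti_join(current_fridge_data, by = c("ingredients" = "current_fridge"))

# Keep only recipes with no missing ingredient (amounts are not checked)
what_can_I_make <- recipe_list %>%
  filter(!recipe %in% missing_items$recipe)
```

With this shape, adding another recipe only means adding rows to recipe_list. The missing_items table also shows which ingredients would need to be bought for the recipes that were dropped.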
    
    

    【Comments】:

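    • Hi @andrea, thanks for the suggestion. This looks like a really smart solution, but I'm not sure it is flexible enough. I spent a day coming up with a better example and then rewrote the question. The new example has an additional table.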
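    • I accepted the suggested renames to the solution so that it matches the new example. Judging by the example as it now stands, Ronak's approach is far from sufficient.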