【Posted】: 2021-03-26 15:50:11
【Question】:
I have two data frames.
symbols <- c("Santa", "Elves", "Candy Cane", "Reindeers", "Cats",
"Turkey", "Mashed Potatoes", "Cranberry Sauce", "Dogs",
"Eggs", "Chocolates with cream", "Bunnies", "Flowers", "Donut")
df1 <- data.frame(symbols)
df1
symbols
1 Santa
2 Elves
3 Candy Cane
4 Reindeers
5 Cats
6 Turkey
7 Mashed Potatoes
8 Cranberry Sauce
9 Dogs
10 Eggs
11 Chocolates with cream
12 Bunnies
13 Flowers
14 Donut
holiday <- c("Christmas", "Thanksgiving", "Easter")
v1 <- c("Santa", "Turkey", "Eggs")
v2 <- c("Elves", "Mashed Potatoes", "Chocolates with cream")
v3 <- c("Candy Canes", "Cranberry Sauce", "Bunnies")
v4 <- c("Reindeers", NA, "Flowers")
df2 <- data.frame(holiday, v1, v2, v3, v4)
df2
       holiday     v1                    v2              v3        v4
1    Christmas  Santa                 Elves     Candy Canes Reindeers
2 Thanksgiving Turkey       Mashed Potatoes Cranberry Sauce      <NA>
3       Easter   Eggs Chocolates with cream         Bunnies   Flowers
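If anything in df1$symbols matches any value in df2 (df2$holiday, df2$v1, df2$v2, df2$v3, df2$v4), I want the corresponding df2$holiday value written to a new column in df1.
Ideally, I would end up with a df1 that looks like this: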
df1
                 symbols      holiday
1                  Santa    Christmas
2                  Elves    Christmas
3             Candy Cane    Christmas
4              Reindeers    Christmas
5                   Cats         <NA>
6                 Turkey Thanksgiving
7        Mashed Potatoes Thanksgiving
8        Cranberry Sauce Thanksgiving
9                   Dogs         <NA>
10                  Eggs       Easter
11 Chocolates with cream       Easter
12               Bunnies       Easter
13               Flowers       Easter
14                 Donut         <NA>
One way I thought of doing this is to split df2 apart and then run a left_join for each column:
df2_v1 <- data.frame(df2$holiday, df2$v1)
df2_v2 <- data.frame(df2$holiday, df2$v2)
df2_v3 <- data.frame(df2$holiday, df2$v3)
df2_v4 <- data.frame(df2$holiday, df2$v4)
Then I can left_join df1 with each df2_v# in turn. For example:
library(dplyr)  # provides left_join
df1_x <- left_join(df1, df2_v1, by = c("symbols" = "df2.v1"))
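I could then merge the results or use some ifelse logic to get a clean df1$holiday column, but this would be very time-consuming if df2 had more columns.
Is there a faster way?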
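For illustration, here is a minimal base R sketch of one possible way to avoid the per-column joins: stack all value columns of df2 into a single two-column lookup table, then use match() once. The names `value_cols` and `lookup` are made up for this sketch, and stringsAsFactors = FALSE is added as an assumption so the columns are plain character vectors on older R versions. The sketch only matches against v1 to v4, not against df2$holiday itself. Note also that df2 as posted contains "Candy Canes" while df1 contains "Candy Cane", so an exact match leaves that row as NA unless the spelling is harmonized first.

```r
# Data as given in the question (stringsAsFactors = FALSE is an assumption
# added here so the sketch behaves the same on R < 4.0).
symbols <- c("Santa", "Elves", "Candy Cane", "Reindeers", "Cats",
             "Turkey", "Mashed Potatoes", "Cranberry Sauce", "Dogs",
             "Eggs", "Chocolates with cream", "Bunnies", "Flowers", "Donut")
df1 <- data.frame(symbols, stringsAsFactors = FALSE)

holiday <- c("Christmas", "Thanksgiving", "Easter")
v1 <- c("Santa", "Turkey", "Eggs")
v2 <- c("Elves", "Mashed Potatoes", "Chocolates with cream")
v3 <- c("Candy Canes", "Cranberry Sauce", "Bunnies")
v4 <- c("Reindeers", NA, "Flowers")
df2 <- data.frame(holiday, v1, v2, v3, v4, stringsAsFactors = FALSE)

# Every column except 'holiday' holds symbols; this adapts to any number of v# columns.
value_cols <- setdiff(names(df2), "holiday")

# Stack the value columns into one long lookup table (hypothetical name).
# unlist() concatenates column by column, so the holiday vector is repeated
# once per value column to stay aligned.
lookup <- data.frame(
  symbols = unlist(df2[value_cols], use.names = FALSE),
  holiday = rep(df2$holiday, times = length(value_cols)),
  stringsAsFactors = FALSE
)
lookup <- lookup[!is.na(lookup$symbols), ]

# match() returns the position of each symbol in the lookup (NA if absent),
# which is then used to pull the corresponding holiday.
df1$holiday <- lookup$holiday[match(df1$symbols, lookup$symbols)]
df1
```

Because the lookup is built from whatever columns df2 has, adding v5, v6, and so on would not require any extra code. If a symbol appeared under more than one holiday, match() would keep only the first occurrence.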
【Discussion】: