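This was harder than I expected. I think the approach below works, but I hope someone can come up with something more elegant.
First, let's create some data to work with (please do this yourself in future). I'm including it in case anyone else wants to try: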
library(tidyverse)

df1 <- tribble(
  ~state, ~group, ~species,
  "CA", 2, "cat, dog, chicken, mouse",
  "CA", 1, "cat",
  "NV", 1, "dog, chicken",
  "NV", 2, "chicken",
  "WA", 1, "chicken, rat, mouse, lion",
  "WA", 2, "dog, cat",
  "WA", 3, "dog, chicken",
  "WA", 4, "cat, chicken")

df2 <- tribble(
  ~state, ~special_species,
  "CA", "cat",
  "CA", "chicken",
  "CA", "mouse",
  "WA", "cat",
  "WA", "chicken",
  "NV", "dog")
The solution is then:
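df1 %>%
  # one row per (state, group, species)
  separate_rows(species) %>%
  # pair each species with every special species of the same state
  full_join(df2, by = "state") %>%
  # keep only the pairs where the species is special in that state
  filter(species == special_species) %>%
  group_by(state, group) %>%
  summarise(species = paste(special_species, collapse = ", ")) %>%
  # join back to df1 so groups with no special species reappear as NA
  full_join(df1, by = c("state" = "state", "group" = "group")) %>%
  select(state, group, special_species = species.x) %>%
  arrange(state)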
#> # A tibble: 8 x 3
#> # Groups:   state [3]
#>   state group special_species
#>   <chr> <dbl> <chr>
#> 1 CA        1 cat
#> 2 CA        2 cat, chicken, mouse
#> 3 NV        1 dog
#> 4 NV        2 <NA>
#> 5 WA        1 chicken
#> 6 WA        2 cat
#> 7 WA        3 chicken
#> 8 WA        4 cat, chicken
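If you can accept the desired output in a slightly different format, the code can be simplified considerably. For example, the following is correct except that the NA row (NV, group 2) is dropped: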
df1 %>%
  separate_rows(species) %>%
  full_join(df2, by = "state") %>%
  # groups with no special species are filtered out here and never restored
  filter(species == special_species) %>%
  group_by(state, group) %>%
  summarise(species = paste(special_species, collapse = ", "))
#> # A tibble: 7 x 3
#> # Groups:   state [3]
#>   state group species
#>   <chr> <dbl> <chr>
#> 1 CA        1 cat
#> 2 CA        2 cat, chicken, mouse
#> 3 NV        1 dog
#> 4 WA        1 chicken
#> 5 WA        2 cat
#> 6 WA        3 chicken
#> 7 WA        4 cat, chicken
Created on 2019-12-03 by the reprex package (v0.3.0)
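One possible way to tidy this up, offered as a sketch rather than as part of the reprex above: joining on both `state` and the species name at once should make the separate `filter()` step unnecessary, and a `left_join()` back onto the `state`/`group` pairs of `df1` keeps the groups with no match as `NA`. The names `matched` and `result` are mine, purely for illustration, and the explicit `sep` pattern assumes the species are always separated by a comma and optional whitespace.

```r
library(tidyverse)

df1 <- tribble(
  ~state, ~group, ~species,
  "CA", 2, "cat, dog, chicken, mouse",
  "CA", 1, "cat",
  "NV", 1, "dog, chicken",
  "NV", 2, "chicken",
  "WA", 1, "chicken, rat, mouse, lion",
  "WA", 2, "dog, cat",
  "WA", 3, "dog, chicken",
  "WA", 4, "cat, chicken")

df2 <- tribble(
  ~state, ~special_species,
  "CA", "cat",
  "CA", "chicken",
  "CA", "mouse",
  "WA", "cat",
  "WA", "chicken",
  "NV", "dog")

# Keep only the species that are special in their own state by joining
# on both keys, then collapse them back into one string per group.
matched <- df1 %>%
  separate_rows(species, sep = ",\\s*") %>%
  inner_join(df2, by = c("state" = "state", "species" = "special_species")) %>%
  group_by(state, group) %>%
  summarise(special_species = paste(species, collapse = ", ")) %>%
  ungroup()

# Re-attach every (state, group) pair so unmatched groups show up as NA.
result <- df1 %>%
  select(state, group) %>%
  left_join(matched, by = c("state", "group")) %>%
  arrange(state, group)

result
```

On the sample data this should give the same eight rows as the first solution, including the `NA` for NV group 2. I have not checked it against inputs where a species name is repeated within a single group.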