How to mutate a new tibble column to include a data frame whose name is given as a string in the tibble? Answers

【Question title】: How to mutate a new tibble column to include a data frame whose name is given as a string in the tibble?
【Posted】: 2022-01-20 22:24:34
【Question description】:

As part of a data-wrangling pipeline that involves nesting and unnesting with {tidyr}, I have an object that looks like this:

library(dplyr)

trb <-
  tibble(data_mtcars      = rep(list(mtcars), 3),
         data_iris        = rep(list(iris), 3),
         data_trees       = rep(list(trees), 3),
         person           = c("lucy", "dan", "john"),
         chosen_data_name = c("data_mtcars", "data_iris", "data_trees"))

trb
#> # A tibble: 3 x 5
#>   data_mtcars    data_iris      data_trees    person chosen_data_name
#>   <list>         <list>         <list>        <chr>  <chr>           
#> 1 <df [32 x 11]> <df [150 x 5]> <df [31 x 3]> lucy   data_mtcars     
#> 2 <df [32 x 11]> <df [150 x 5]> <df [31 x 3]> dan    data_iris       
#> 3 <df [32 x 11]> <df [150 x 5]> <df [31 x 3]> john   data_trees

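I want to mutate a new column that contains the dataset from columns 1:3 whose name is given in the chosen_data_name column.

In other words, I want to use the power of get() to replace a string with the object of the same name. But unlike a normal call to get(), the object the string refers to is stored inside the tibble rather than in the global environment (on purpose).

This is why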

trb %>% 
  mutate(chosen_data_here = get(chosen_data_name))

doesn't work.

How can I mutate a new column in trb to get the following output:

## # A tibble: 3 x 6
##   data_mtcars    data_iris      data_trees    person chosen_data_name chosen_data_here
##   <list>         <list>         <list>        <chr>  <chr>            <list>          
## 1 <df [32 x 11]> <df [150 x 5]> <df [31 x 3]> lucy   data_mtcars      <df [32 x 11]> # mtcars 
## 2 <df [32 x 11]> <df [150 x 5]> <df [31 x 3]> dan    data_iris        <df [150 x 5]> # iris 
## 3 <df [32 x 11]> <df [150 x 5]> <df [31 x 3]> john   data_trees       <df [31 x 3]>  # trees

【Question comments】:

Tags: r dplyr tibble

【Solution 1】:

Using nest and rowwise:

library(tidyr)  # provides nest()

trb %>% 
  rowwise() %>% 
  mutate(nest(get(chosen_data_name), chosen_data_here = everything()))

# A tibble: 3 x 6
# Rowwise: 
  data_mtcars    data_iris      data_trees    person chosen_data_name chosen_data_here  
  <list>         <list>         <list>        <chr>  <chr>            <list>            
1 <df [32 x 11]> <df [150 x 5]> <df [31 x 3]> lucy   data_mtcars      <tibble [32 x 11]>
2 <df [32 x 11]> <df [150 x 5]> <df [31 x 3]> dan    data_iris        <tibble [150 x 5]>
3 <df [32 x 11]> <df [150 x 5]> <df [31 x 3]> john   data_trees       <tibble [31 x 3]>
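
A slightly shorter variant of the same idea, offered only as a sketch: get() looks up one name at a time, so rowwise() is what lets it see a single chosen_data_name per row, and wrapping the result in list() stores the data frame in a list column without going through nest(). This assumes trb is built exactly as in the question; ungroup() is an addition here, used only to drop the rowwise grouping afterwards.

```r
library(dplyr)

# Same example object as in the question
trb <-
  tibble(data_mtcars      = rep(list(mtcars), 3),
         data_iris        = rep(list(iris), 3),
         data_trees       = rep(list(trees), 3),
         person           = c("lucy", "dan", "john"),
         chosen_data_name = c("data_mtcars", "data_iris", "data_trees"))

res <- trb %>%
  rowwise() %>%
  # Under rowwise(), get() resolves the string against the current row's columns
  mutate(chosen_data_here = list(get(chosen_data_name))) %>%
  ungroup()  # assumption: later steps should not stay rowwise

# Inspect which dataset each person ended up with
lapply(res$chosen_data_here, dim)
```

With this variant the new column should hold plain data frames rather than the tibbles produced by nest().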

【Comments】:

【Solution 2】:

Not sure if this is as "automatic" as you would like, but this works:

trb %>% 
  mutate(chosen_data_here = case_when(
    chosen_data_name == "data_mtcars" ~ data_mtcars,
    chosen_data_name == "data_iris" ~ data_iris,
    chosen_data_name == "data_trees" ~ data_trees
  ))

# A tibble: 3 × 6
#   data_mtcars    data_iris      data_trees    person chosen_data_name chosen_data_here
#   <list>         <list>         <list>        <chr>  <chr>            <list>          
# 1 <df [32 × 11]> <df [150 × 5]> <df [31 × 3]> lucy   data_mtcars      <df [32 × 11]>  
# 2 <df [32 × 11]> <df [150 × 5]> <df [31 × 3]> dan    data_iris        <df [150 × 5]>  
# 3 <df [32 × 11]> <df [150 × 6]> <df [31 × 3]> john   data_trees       <df [31 × 3]>

【Comments】:

Thanks! But as you hinted, this approach is limited and not very scalable. In my real data I have more than 3 "chosen_data_names", so it would get tedious.
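
One way around the scalability concern, as a rough sketch rather than a definitive answer: look up each row's dataset by column name and row position, so no dataset name has to be written out by hand. The helper add_chosen_data() and its arguments name_col and out_col are hypothetical names introduced only for this illustration; trb is assumed to be built as in the question.

```r
library(dplyr)

# Same example object as in the question
trb <-
  tibble(data_mtcars      = rep(list(mtcars), 3),
         data_iris        = rep(list(iris), 3),
         data_trees       = rep(list(trees), 3),
         person           = c("lucy", "dan", "john"),
         chosen_data_name = c("data_mtcars", "data_iris", "data_trees"))

# Hypothetical helper: for row i, take element i of the list column
# whose name is stored in `name_col`, and collect the results in `out_col`.
add_chosen_data <- function(df,
                            name_col = "chosen_data_name",
                            out_col  = "chosen_data_here") {
  chosen <- df[[name_col]]
  df[[out_col]] <- lapply(seq_along(chosen), function(i) df[[chosen[i]]][[i]])
  df
}

res <- trb %>% add_chosen_data()

# Inspect which dataset each person ended up with
lapply(res$chosen_data_here, dim)
```

Because the lookup is driven entirely by the strings in chosen_data_name, adding more data_* columns should not require any change to the helper.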