Extending a tidyverse function via `...`

[Question title]: Extending a tidyverse function via `...`
[Posted]: 2021-10-26 21:25:25
[Question]:

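I'm trying to extend my `foo` function so that users can pass any number of additional arguments through `...`.

These `...` arguments should be treated exactly like the current three arguments (`time`, `outcome`, `trt_gr`).

Is this possible in R?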

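library(tidyverse)  # provides %>%, group_by(), summarise() and str_c()

foo <- function(time = 1, outcome = 1, trt_gr = 1, ...){

  # Turn each count into the sequence 1..n
  time <- seq_len(time)
  outcome <- seq_len(outcome)
  trt_gr <- seq_len(trt_gr)

  # One row per combination, once for "control" and once for "treatment"
  data <- expand.grid(time = time, outcome = outcome, trt_gr = trt_gr, info. = c("control","treatment"))

  # Collapse each group's two labels into "treatment vs. control"
  data %>% 
    group_by(outcome, time, trt_gr) %>%
    summarise(info. = str_c(sort(info., decreasing = TRUE), 
                            collapse = ' vs. '), .groups = 'drop') 
}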

# EXAMPLE OF CURRENT USE:

foo()

#  outcome  time trt_gr info.                
#    <int> <int>  <int> <chr>                
#1       1     1      1 treatment vs. control

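[Question comments]:

Tags: r dataframe function dplyr tidyverse

[Solution 1]:

Yes, this is possible. We can replace your arguments with the ellipsis `...` and let the function generate any number of columns with custom column names. Here is such a function in tidyverse style: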

library(tidyverse)

foo <- function(...){
  
  dots <- rlang::list2(...) 
  var_nms <- names(dots)
  inp <- purrr::map(dots, seq_len)

  data <- tidyr::expand_grid(!!! inp,
                             info. = c("control","treatment"))
  
  data %>% 
    dplyr::group_by(!!!syms(var_nms)) %>%
    dplyr::summarise(info. = stringr::str_c(sort(info., decreasing = TRUE), 
                                            collapse = ' vs. '), .groups = 'drop') 
}

foo(time = 1, outcome = 1, trt_gr = 1)
#> # A tibble: 1 x 4
#>    time outcome trt_gr info.                
#>   <int>   <int>  <int> <chr>                
#> 1     1       1      1 treatment vs. control

foo(some = 2, new = 1, colnames = 3)
#> # A tibble: 6 x 4
#>    some   new colnames info.                
#>   <int> <int>    <int> <chr>                
#> 1     1     1        1 treatment vs. control
#> 2     1     1        2 treatment vs. control
#> 3     1     1        3 treatment vs. control
#> 4     2     1        1 treatment vs. control
#> 5     2     1        2 treatment vs. control
#> 6     2     1        3 treatment vs. control

Created on 2021-08-26 by the reprex package (v0.3.0)
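The question also asks for the three original arguments to stay in place, with `...` adding to them rather than replacing them. A minimal sketch of that variant follows, assuming every extra argument is named. The name `foo_ext` and the argument `site` are hypothetical and only serve as illustration; this is one possible arrangement, not the answer's own code.

```r
library(tidyverse)

# Hypothetical variant: keeps the question's three defaults and
# appends whatever named arguments arrive through `...`.
foo_ext <- function(time = 1, outcome = 1, trt_gr = 1, ...) {
  # Fixed arguments first, then the extras collected from `...`
  args <- c(list(time = time, outcome = outcome, trt_gr = trt_gr),
            rlang::list2(...))

  # Assumption: every argument must be named, because the names become columns
  stopifnot(!is.null(names(args)), all(names(args) != ""))

  # Turn each count into the sequence 1..n
  inp <- purrr::map(args, seq_len)

  # Splice the named list into expand_grid() and group by all of its names
  tidyr::expand_grid(!!!inp, info. = c("control", "treatment")) %>%
    dplyr::group_by(!!!rlang::syms(names(inp))) %>%
    dplyr::summarise(info. = stringr::str_c(sort(info., decreasing = TRUE),
                                            collapse = " vs. "),
                     .groups = "drop")
}

# The three defaults plus one extra column with two levels
res <- foo_ext(site = 2)
```

Here `foo_ext()` with no arguments should behave like the original `foo()`, while each extra named argument contributes one more grouping column.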

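Update

To answer the question added in the comments: yes, we can vectorize the function above as follows. This also skips, on the fly, any column whose value is `0`: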

library(tidyverse)

foo <- function(...){
  
  dots <- rlang::list2(...) 
  var_nms <- names(dots)
  inp_ls <- map(dots, ~ map(.x, seq_len)) %>% transpose %>% map(compact)
  
  data_ls <- map(inp_ls, 
                 ~ tidyr::expand_grid(!!! .x,
                                      info. = c("control","treatment")))
  
  map2(data_ls, inp_ls, ~ .x %>% 
        dplyr::group_by(!!!syms(names(.y))) %>%
        dplyr::summarise(info. = stringr::str_c(sort(info., decreasing = TRUE), 
                                                collapse = ' vs. '), .groups = 'drop')) 
}

foo(some = c(1,2), new = c(1,0), colnames = c(1,3))
#> [[1]]
#> # A tibble: 1 x 4
#>    some   new colnames info.                
#>   <int> <int>    <int> <chr>                
#> 1     1     1        1 treatment vs. control
#> 
#> [[2]]
#> # A tibble: 6 x 3
#>    some colnames info.                
#>   <int>    <int> <chr>                
#> 1     1        1 treatment vs. control
#> 2     1        2 treatment vs. control
#> 3     1        3 treatment vs. control
#> 4     2        1 treatment vs. control
#> 5     2        2 treatment vs. control
#> 6     2        3 treatment vs. control

Created on 2021-08-26 by the reprex package (v0.3.0)
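The `inp_ls` line in the vectorized function does three things at once, which can be easier to follow when split up. The sketch below repeats that step on its own, using the same input as the example call; the intermediate names `seqs` and `runs` are introduced here only for illustration.

```r
library(purrr)

dots <- list(some = c(1, 2), new = c(1, 0), colnames = c(1, 3))

# Step 1: one integer sequence per element of each argument
seqs <- map(dots, ~ map(.x, seq_len))

# Step 2: regroup from "one entry per argument" to "one entry per run"
runs <- transpose(seqs)

# Step 3: drop empty sequences; seq_len(0) is integer(0), so zeros vanish
runs <- map(runs, compact)

names(runs[[1]])  # all three arguments survive in the first run
names(runs[[2]])  # `new` was 0 in the second run, so it is dropped
```

Because `compact()` removes empty elements, the second run keeps only `some` and `colnames`, which is why the second tibble in the output has three columns instead of four. Recent purrr releases document `transpose()` as superseded by `list_transpose()`; the sketch keeps `transpose()` to match the answer.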

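[Discussion]:

@Reza: Yes, that is possible too. Please see my updated answer.
@Reza: I think I need to see an example to understand what you mean. By "round 1", do you mean the first data.frame in the list? And "control" is not another column but a special argument that we use to control something (I haven't understood that part yet).
@Reza: Now I understand. I have updated the answer!
HERE is an interesting question.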