【问题标题】:How to delay evaluation of function passed as argument to purrr::pmap如何延迟对作为参数传递给 purrr::pmap 的函数的评估
【发布时间】:2019-06-18 13:56:52
【问题描述】:

我正在尝试使用嵌套数据框 (https://r4ds.had.co.nz/many-models.html) 方法使用 lcmm::lcmm()purrr::pmap() 拟合多个潜在类增长曲线。

此过程需要使用 lcmm() 拟合具有一个类 (k = 1) 的模型,然后将此模型用作 lcmm::gridsearch() 的输入,它会从此 k = 1 模型输入到 k = 2+ 类模型中。 gridsearch() 还需要对 k = 2+ 模型的模型调用(加上两个其他参数),它在对 gridsearch() 的调用中作为对 lcmm() 的调用传递。我通常的方法是使用pmap() 将参数列表传递给gridsearch(),但list() 立即评估对lcmm() 的模型调用并尝试拟合模型而不是将模型调用传递给gridsearch() (见confusing behavior of purrr::pmap with rlang; "to quote" or not to quote argument that is the Q)。

NB 使用 RStudio 的函数查看器 (F2),似乎 lcmm::gridsearch() 使用 match.call() 来调整具有用户定义数量的随机起始值的 k = 2+ 模型调用,并且然后遍历这些以找到首选的 k = 2+ 解决方案。

我在下面包含了一个代表。在 pmap 中包装对 gridsearch 的调用时,命令失败并显示“mutate_impl(.data, dots) 中的错误:评估错误:参数的长度为零。” - 我认为这是因为 R 试图评估 k = 2+ 模型对 lcmm() 的调用,但我可能是错的。

当作为参数传递给pmap() 时,如何延迟对lcmm() 的评估?

下面的例子:

library(lcmm)
#> Warning: package 'lcmm' was built under R version 3.5.2
#> Loading required package: survival
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
library(purrr)

# load lcmm example data

data("data_lcmm")

# take sample

set.seed(123)

data_lcmm <-
  data_lcmm %>%
  sample_frac(0.1)




# NB grouping variable is needed to reproduce desired data structure 

data_lcmm <-
  data_lcmm %>%
  mutate(group_var = sample(c(0, 1),
    size = nrow(data_lcmm),
    replace = TRUE
  ))



data_lcmm_nest <-
  data_lcmm %>%
  group_by(group_var) %>%
  nest() %>% 
  mutate(data= map(data, as.data.frame))


# lcmm call from ?lcmm

lcmm_k1 <- function(df) {
  lcmm(Ydep2 ~ Time + I(Time^2),
    random = ~Time, subject = "ID", ng = 1,
    data = data_lcmm_nest$data[[1]], link = "linear"
  )
}


# fit k = 1 models
data_lcmm_nest <-
  data_lcmm_nest %>%
  mutate(lcgm = map(data, lcmm_k1))
#> Be patient, lcmm is running ... 
#> The program took 0.18 seconds 
#> Be patient, lcmm is running ... 
#> The program took 0.19 seconds

# this works for a single row
desired_result <-
  gridsearch(
    m = lcmm(Ydep2 ~ Time + I(Time^2),
      mixture = ~Time,
      random = ~Time, subject = "ID", ng = 2,
      data = data_lcmm_nest$data[[1]], link = "linear"
    ),
    rep = 5,
    maxiter = 2,
    minit = data_lcmm_nest$lcgm[[1]]
  )
#> Be patient, lcmm is running ... 
#> The program took 0.45 seconds 
#> Be patient, lcmm is running ... 
#> The program took 0.45 seconds 
#> Be patient, lcmm is running ... 
#> The program took 0.45 seconds 
#> Be patient, lcmm is running ... 
#> The program took 0.45 seconds 
#> Be patient, lcmm is running ... 
#> The program took 0.47 seconds 
#> Be patient, lcmm is running ... 
#> The program took 0.61 seconds


# this fails with Error in mutate_impl(.data, dots) :
# Evaluation error: argument is of length zero.

data_lcmm_nest %>%
  mutate(lcgm_2 = pmap(
    list(
      m = lcmm(Ydep2 ~ Time + I(Time^2),
        mixture = ~Time,
        random = ~Time, subject = "ID", ng = 2,
        data = data, link = "linear"
      ),
      rep = 5,
      maxiter = 2,
      minit = lcgm
    ), gridsearch
  ))
#> Error in mutate_impl(.data, dots): Evaluation error: argument is of length zero.


# wrapping gridsearch in helper also fails

grid_search_helper <- function(g_rep, g_maxiter, g_minit, g_m) {
  gridsearch(
    m = lcmm(Ydep2 ~ Time + I(Time^2),
      mixture = ~Time,
      random = ~Time, subject = "ID", ng = 2,
      data = g_m, link = "linear"
    ),
    rep = g_rep,
    maxiter = g_maxiter,
    minit = g_minit
  )
}


data_lcmm_nest %>%
  mutate(lcgm_2 = pmap(
    list(
      5,
      2,
      lcgm,
      data
    ), grid_search_helper
  ))
#> Error in mutate_impl(.data, dots): Evaluation error: object 'g_m' not found.

reprex package (v0.2.1) 于 2019 年 1 月 24 日创建

【问题讨论】:

    标签: r purrr


    【解决方案1】:

    这不完全是我原来的问题的答案,因为它不使用purrr,但是使用 for 循环进行迭代没有这个延迟评估问题:

    library(lcmm)
    #> Loading required package: survival
    #> Loading required package: parallel
    library(dplyr)
    #> 
    #> Attaching package: 'dplyr'
    #> The following objects are masked from 'package:stats':
    #> 
    #>     filter, lag
    #> The following objects are masked from 'package:base':
    #> 
    #>     intersect, setdiff, setequal, union
    library(tidyr)
    library(purrr)
    
    
    data("data_lcmm")
    
    # take sample
    
    set.seed(123)
    
    data_lcmm <-
      data_lcmm %>%
      sample_frac(0.1)
    
    
    
    
    # NB grouping variable is needed to reproduce desired data structure 
    
    data_lcmm <-
      data_lcmm %>%
      mutate(group_var = sample(c(0, 1),
                                size = nrow(data_lcmm),
                                replace = TRUE
      ))
    
    
    
    data_lcmm_nest <-
      data_lcmm %>%
      group_by(group_var) %>%
      nest() %>% 
      mutate(data= map(data, as.data.frame))
    
    
    
    # lcmm call from ?lcmm
    
    lcmm_k1 <- function(df) {
      lcmm(Ydep2 ~ Time + I(Time^2),
           random = ~Time, subject = "ID", ng = 1,
           data = data_lcmm_nest$data[[1]], link = "linear"
      )
    }
    
    
    # fit k = 1 models
    data_lcmm_nest <-
      data_lcmm_nest %>%
      mutate(lcgm = map(data, lcmm_k1))
    #> Be patient, lcmm is running ... 
    #> The program took 0.19 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.22 seconds
    
    # set-up output vector
    
    results <- vector(mode = "list", length = nrow(data_lcmm_nest))
    
    # fit models
    
    for(i in 1:nrow(data_lcmm_nest)){
      
      results[[i]] <- gridsearch(
        m = lcmm(Ydep2 ~ Time + I(Time^2),
                 mixture = ~Time,
                 random = ~Time, subject = "ID", ng = 2,
                 data = data_lcmm_nest$data[[i]], link = "linear"
        ),
        rep = 5,
        maxiter = 2,
        minit = data_lcmm_nest$lcgm[[i]]
      )
    }
    #> Be patient, lcmm is running ... 
    #> The program took 0.56 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.42 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.47 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.48 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.52 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.5 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.33 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.32 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.39 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.38 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.37 seconds 
    #> Be patient, lcmm is running ... 
    #> The program took 0.47 seconds
    
    data_lcmm_nest <- 
    data_lcmm_nest %>% 
      ungroup() %>% 
      mutate(res = results)
    

    reprex package (v0.3.0) 于 2021-04-20 创建

    devtools::session_info()
    #> - Session info ---------------------------------------------------------------
    #>  setting  value                       
    #>  version  R version 4.0.3 (2020-10-10)
    #>  os       Windows 10 x64              
    #>  system   x86_64, mingw32             
    #>  ui       RTerm                       
    #>  language (EN)                        
    #>  collate  English_United Kingdom.1252 
    #>  ctype    English_United Kingdom.1252 
    #>  tz       Europe/London               
    #>  date     2021-04-20                  
    #> 
    #> - Packages -------------------------------------------------------------------
    #>  package     * version date       lib source        
    #>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.3)
    #>  callr         3.5.1   2020-10-13 [1] CRAN (R 4.0.3)
    #>  cli           2.2.0   2020-11-20 [1] CRAN (R 4.0.3)
    #>  crayon        1.3.4   2017-09-16 [1] CRAN (R 4.0.3)
    #>  desc          1.2.0   2018-05-01 [1] CRAN (R 4.0.3)
    #>  devtools      2.3.2   2020-09-18 [1] CRAN (R 4.0.3)
    #>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
    #>  dplyr       * 1.0.2   2020-08-18 [1] CRAN (R 4.0.3)
    #>  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.3)
    #>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.3)
    #>  fansi         0.4.1   2020-01-08 [1] CRAN (R 4.0.3)
    #>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.3)
    #>  generics      0.1.0   2020-10-31 [1] CRAN (R 4.0.3)
    #>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.3)
    #>  highr         0.8     2019-03-20 [1] CRAN (R 4.0.3)
    #>  htmltools     0.5.0   2020-06-16 [1] CRAN (R 4.0.3)
    #>  knitr         1.30    2020-09-22 [1] CRAN (R 4.0.3)
    #>  lattice       0.20-41 2020-04-02 [2] CRAN (R 4.0.3)
    #>  lcmm        * 1.9.2   2020-07-07 [1] CRAN (R 4.0.3)
    #>  lifecycle     0.2.0   2020-03-06 [1] CRAN (R 4.0.3)
    #>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
    #>  Matrix        1.2-18  2019-11-27 [2] CRAN (R 4.0.3)
    #>  memoise       1.1.0   2017-04-21 [1] CRAN (R 4.0.3)
    #>  pillar        1.4.7   2020-11-20 [1] CRAN (R 4.0.3)
    #>  pkgbuild      1.2.0   2020-12-15 [1] CRAN (R 4.0.3)
    #>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.3)
    #>  pkgload       1.1.0   2020-05-29 [1] CRAN (R 4.0.3)
    #>  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.3)
    #>  processx      3.4.5   2020-11-30 [1] CRAN (R 4.0.3)
    #>  ps            1.5.0   2020-12-05 [1] CRAN (R 4.0.3)
    #>  purrr       * 0.3.4   2020-04-17 [1] CRAN (R 4.0.3)
    #>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
    #>  remotes       2.2.0   2020-07-21 [1] CRAN (R 4.0.3)
    #>  rlang         0.4.10  2020-12-30 [1] CRAN (R 4.0.3)
    #>  rmarkdown     2.6     2020-12-14 [1] CRAN (R 4.0.3)
    #>  rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.0.3)
    #>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.3)
    #>  stringi       1.5.3   2020-09-09 [1] CRAN (R 4.0.3)
    #>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.3)
    #>  survival    * 3.2-7   2020-09-28 [1] CRAN (R 4.0.3)
    #>  testthat      3.0.1   2020-12-17 [1] CRAN (R 4.0.3)
    #>  tibble        3.0.4   2020-10-12 [1] CRAN (R 4.0.3)
    #>  tidyr       * 1.1.2   2020-08-27 [1] CRAN (R 4.0.3)
    #>  tidyselect    1.1.0   2020-05-11 [1] CRAN (R 4.0.3)
    #>  usethis       2.0.0   2020-12-10 [1] CRAN (R 4.0.3)
    #>  vctrs         0.3.6   2020-12-17 [1] CRAN (R 4.0.3)
    #>  withr         2.3.0   2020-09-22 [1] CRAN (R 4.0.3)
    #>  xfun          0.20    2021-01-06 [1] CRAN (R 4.0.3)
    #>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.3)
    #> 
    #> [1] M:/R/win-library/3.6
    #> [2] C:/Program Files/R/R-4.0.3/library
    

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 2021-06-21
      • 1970-01-01
      • 2021-09-28
      • 1970-01-01
      • 1970-01-01
      • 2010-11-26
      • 2018-01-03
      相关资源
      最近更新 更多