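[Posted]: 2019-06-18 13:56:52
[Problem description]:
I'm trying to fit multiple latent class growth curves with lcmm::lcmm() and purrr::pmap(), using the nested data frame approach (https://r4ds.had.co.nz/many-models.html).
This process requires fitting a model with one class (k = 1) using lcmm(), then using that model as input to lcmm::gridsearch(), which feeds the k = 1 model into the models with k = 2+ classes. gridsearch() also needs the model call for the k = 2+ model (plus two other arguments), and that call is supplied inside the call to gridsearch() as a call to lcmm(). My usual approach would be to pass a list of arguments to gridsearch() with pmap(), but list() immediately evaluates the lcmm() model call and tries to fit the model instead of passing the model call on to gridsearch() (see "confusing behavior of purrr::pmap with rlang; "to quote" or not to quote argument that is the Q").
NB: using RStudio's function viewer (F2), it looks like lcmm::gridsearch() uses match.call() to adjust the k = 2+ model call with a user-defined number of random starting values, and then iterates through these to find the preferred k = 2+ solution.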
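To illustrate the difference I mean, here is a minimal base R sketch. It does not use lcmm at all: capture_call() and fake_fit() are hypothetical stand-ins I made up for a function that reads its argument via match.call() and for an expensive model fit, so this only shows the general mechanism, not what gridsearch() actually does internally.

```r
# Hypothetical stand-in for a function that inspects its `m` argument with
# match.call() rather than evaluating it.
capture_call <- function(m) {
  mc <- match.call()
  mc$m # the expression supplied as `m`, as seen in the matched call
}

# Hypothetical stand-in for a model fit; it counts how often it is run.
fit_calls <- 0
fake_fit <- function(ng) {
  fit_calls <<- fit_calls + 1
  ng
}

# Supplied directly: `m` is captured as a call and fake_fit() never runs.
direct <- capture_call(m = fake_fit(ng = 2))
calls_after_direct <- fit_calls

# Supplied via list(): fake_fit() runs as soon as the list is built,
# so only its result (not the call) is left to pass on.
args <- list(m = fake_fit(ng = 2))
via_list <- do.call(capture_call, args)
```

Here direct is still an unevaluated call, while via_list is just the value that fake_fit() returned, which is the behaviour I suspect is tripping up pmap().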
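I've included a reprex below. When I wrap the call to gridsearch() in pmap(), the command fails with "Error in mutate_impl(.data, dots): Evaluation error: argument is of length zero." I think this is because R is trying to evaluate the k = 2+ model call to lcmm(), but I could be wrong.
How can I delay evaluation of lcmm() when it is passed as an argument to pmap()?
Example below: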
library(lcmm)
#> Warning: package 'lcmm' was built under R version 3.5.2
#> Loading required package: survival
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(tidyr)
library(purrr)
# load lcmm example data
data("data_lcmm")
# take sample
set.seed(123)
data_lcmm <-
  data_lcmm %>%
  sample_frac(0.1)
# NB grouping variable is needed to reproduce desired data structure
data_lcmm <-
  data_lcmm %>%
  mutate(group_var = sample(c(0, 1),
    size = nrow(data_lcmm),
    replace = TRUE
  ))
data_lcmm_nest <-
  data_lcmm %>%
  group_by(group_var) %>%
  nest() %>%
  mutate(data = map(data, as.data.frame))
# lcmm call from ?lcmm
lcmm_k1 <- function(df) {
  # fit on the data frame passed in, not on a fixed row of the nested data
  lcmm(Ydep2 ~ Time + I(Time^2),
    random = ~Time, subject = "ID", ng = 1,
    data = df, link = "linear"
  )
}
# fit k = 1 models
data_lcmm_nest <-
  data_lcmm_nest %>%
  mutate(lcgm = map(data, lcmm_k1))
#> Be patient, lcmm is running ...
#> The program took 0.18 seconds
#> Be patient, lcmm is running ...
#> The program took 0.19 seconds
# this works for a single row
desired_result <-
  gridsearch(
    m = lcmm(Ydep2 ~ Time + I(Time^2),
      mixture = ~Time,
      random = ~Time, subject = "ID", ng = 2,
      data = data_lcmm_nest$data[[1]], link = "linear"
    ),
    rep = 5,
    maxiter = 2,
    minit = data_lcmm_nest$lcgm[[1]]
  )
#> Be patient, lcmm is running ...
#> The program took 0.45 seconds
#> Be patient, lcmm is running ...
#> The program took 0.45 seconds
#> Be patient, lcmm is running ...
#> The program took 0.45 seconds
#> Be patient, lcmm is running ...
#> The program took 0.45 seconds
#> Be patient, lcmm is running ...
#> The program took 0.47 seconds
#> Be patient, lcmm is running ...
#> The program took 0.61 seconds
# this fails with Error in mutate_impl(.data, dots) :
# Evaluation error: argument is of length zero.
data_lcmm_nest %>%
  mutate(lcgm_2 = pmap(
    list(
      m = lcmm(Ydep2 ~ Time + I(Time^2),
        mixture = ~Time,
        random = ~Time, subject = "ID", ng = 2,
        data = data, link = "linear"
      ),
      rep = 5,
      maxiter = 2,
      minit = lcgm
    ), gridsearch
  ))
#> Error in mutate_impl(.data, dots): Evaluation error: argument is of length zero.
# wrapping gridsearch in helper also fails
grid_search_helper <- function(g_rep, g_maxiter, g_minit, g_m) {
  gridsearch(
    m = lcmm(Ydep2 ~ Time + I(Time^2),
      mixture = ~Time,
      random = ~Time, subject = "ID", ng = 2,
      data = g_m, link = "linear"
    ),
    rep = g_rep,
    maxiter = g_maxiter,
    minit = g_minit
  )
}
data_lcmm_nest %>%
  mutate(lcgm_2 = pmap(
    list(
      5,
      2,
      lcgm,
      data
    ), grid_search_helper
  ))
#> Error in mutate_impl(.data, dots): Evaluation error: object 'g_m' not found.
Created on 2019-01-24 by the reprex package (v0.2.1)
[Discussion]: