【Posted】:2020-09-02 21:42:21
【Question】:
Here is a code block that builds some data reproducing the error message I am getting:
# Set up
library(tidyverse)
library(lubridate)
library(foreach)
# Create data
mydf <- data.frame(
  cohort = seq(ymd('2019-01-01'), ymd('2019-12-31'), by = '1 days'),
  n = rnorm(365, 1000, 50) %>% round,
  cohort_cost = rnorm(365, 800, 50)
) %>%
  crossing(tenure_days = 0:365) %>%
  mutate(activity_date = cohort + days(tenure_days)) %>%
  mutate(daily_revenue = rnorm(nrow(.), 20, 1)) %>%
  group_by(cohort) %>%
  arrange(activity_date) %>%
  mutate(cumulative_revenue = cumsum(daily_revenue)) %>%
  arrange(cohort, activity_date) %>%
  mutate(payback_velocity = round(cumulative_revenue / cohort_cost, 2)) %>%
  select(cohort, n, cohort_cost, activity_date, tenure_days, everything())
## wider data
mydf_wide <- mydf %>%
  select(cohort, n, cohort_cost, tenure_days, payback_velocity) %>%
  group_by(cohort, n, cohort_cost) %>%
  pivot_wider(
    names_from = tenure_days,
    values_from = payback_velocity,
    names_prefix = 'velocity_day_'
  ) %>%
  mutate(Category = rep(LETTERS[1:3], length.out = n()))
models <- data.frame(
  from = mydf$tenure_days %>% unique,
  to = mydf$tenure_days %>% unique
) %>%
  expand.grid %>%
  filter(to > from) %>%
  filter(from > 0) %>%
  arrange(from) %>%
  mutate(mod_formula = paste0('velocity_day_', to, ' ~ velocity_day_', from)) %>%
  mutate(Category = rep(LETTERS[1:3], length.out = n()))
model_splits <- models %>% split(.$Category)
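I have a data frame in which each row contains a model specification, and I want to fit each model and store it in a new column created with mutate.
After running the block above, the data I am working with looks like this: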
model_splits$A %>% glimpse
Rows: 22,144
Columns: 4
$ from <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ to <int> 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50, 53, 56, 59, 62, 65, 68, 71, 74, 77, 80, 83, 86, 89, 92, 95, 98, 101, 104, 107, 110,…
$ mod_formula <chr> "velocity_day_2 ~ velocity_day_1", "velocity_day_5 ~ velocity_day_1", "velocity_day_8 ~ velocity_day_1", "velocity_day_11 ~ velocity_day_1", "veloci…
$ Category <chr> "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A"…
That is, a data frame of model specifications. I also have a data frame mydf_wide that looks like this:
mydf_wide %>% head()
# A tibble: 6 x 370
# Groups: cohort, n, cohort_cost [6]
cohort n cohort_cost velocity_day_0 velocity_day_1 velocity_day_2 velocity_day_3 velocity_day_4 velocity_day_5 velocity_day_6 velocity_day_7 velocity_day_8
<date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2019-01-01 884 723. 0.03 0.05 0.08 0.11 0.14 0.17 0.19 0.22 0.25
2 2019-01-02 1026 698. 0.03 0.06 0.09 0.12 0.15 0.17 0.2 0.23 0.26
3 2019-01-03 911 906. 0.02 0.04 0.07 0.09 0.11 0.13 0.15 0.18 0.2
4 2019-01-04 893 828. 0.02 0.05 0.07 0.1 0.12 0.15 0.17 0.2 0.22
5 2019-01-05 924 821. 0.02 0.05 0.07 0.1 0.12 0.15 0.17 0.2 0.22
6 2019-01-06 1032 797. 0.02 0.05 0.08 0.1 0.13 0.15 0.18 0.2 0.23
In a loop, I want to iterate over model_splits and, for each split, use map to fit the models:
# fit some models in a loop
foreach::foreach(c = model_splits %>% names, .combine='c') %do% {
  df <- model_splits[[c]] %>%
    sample_n(3) %>%
    mutate(Model = map(.x = mod_formula, ~lm(.x, data = mydf_wide %>% filter(Category == c))))
}
Error in { : task 2 failed - "Problem with `mutate()` input `Model`.
x 0 (non-NA) cases
ℹ Input `Model` is `map(...)`."
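I did some googling, which suggested looking for NA values, but none of the data frames above has any missing data.
The desired result is a new column Model, added with mutate, that holds the fitted model for each row of the models data frame.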
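As a side note on the error text itself: lm can fail this way without any NA values being present, simply because the data it receives has zero rows. The sketch below is only an illustration on a made-up empty data frame (the names empty, x, y and msg are hypothetical and not part of the data above); it should report the same "0 (non-NA) cases" message.

```r
# A zero-row data frame that contains no NA values at all
# (empty, x and y are illustrative names, not from the question's data)
empty <- data.frame(x = numeric(0), y = numeric(0))

# Confirm there really is nothing missing
anyNA(empty)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    lm(y ~ x, data = empty)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

So checking nrow() of whatever is passed as data may be more informative here than searching for NA values.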
【Discussion】:
- In some cases the filter mydf_wide %>% filter(Category == names(model_splits)[[3]]) returns 0 rows.
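One likely reason filtering mydf_wide on Category can return 0 rows, judging from the "# Groups: cohort, n, cohort_cost" line in the head() output: mydf_wide is still grouped when Category is assigned, so n() is the size of each one-row group and rep(LETTERS[1:3], length.out = n()) only ever yields "A". The following is a minimal sketch on a toy stand-in for mydf_wide (toy, grouped and ungrouped are hypothetical names used for illustration), not a tested fix for the full data.

```r
library(dplyr)

# Toy stand-in for mydf_wide: one row per cohort, still grouped by cohort
toy <- tibble(
  cohort = 1:6,
  velocity_day_1 = c(0.05, 0.06, 0.04, 0.05, 0.05, 0.05)
) %>%
  group_by(cohort)

# In a grouped mutate, n() is the group size (1 here), so every row gets "A"
grouped <- toy %>%
  mutate(Category = rep(LETTERS[1:3], length.out = n()))
count(ungroup(grouped), Category)

# Dropping the grouping first lets n() see all 6 rows, so A, B and C all appear
ungrouped <- toy %>%
  ungroup() %>%
  mutate(Category = rep(LETTERS[1:3], length.out = n()))
count(ungrouped, Category)
```

If the same thing is happening in the question's data, adding ungroup() before the final mutate that creates Category in mydf_wide would give categories B and C some rows, so the filter inside the loop would no longer hand lm an empty data frame.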