[Posted]: 2021-02-19 23:12:19
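[Question]:
My data has multiple observations for each ID. Within each ID, I want to set every row to the most recent non-missing value. I tried combining mutate, group_by(id), and which.max(year), without success.
Data: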
data <- data.frame(
id=c(1,1,2,2,3,3,4,4,5,5),
year=rep(c(2010, 2011), 5),
employ=c("yes", "yes", "no", "yes", "yes", "no", NA, "yes", "no", NA))
> data
id year employ
1 1 2010 yes
2 1 2011 yes
3 2 2010 no
4 2 2011 yes
5 3 2010 yes
6 3 2011 no
7 4 2010 <NA>
8 4 2011 yes
9 5 2010 no
10 5 2011 <NA>
Desired output:
data2 <- data.frame(
id=c(1,1,2,2,3,3,4,4,5,5),
year=c(2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2010, 2010),
employ=c("yes", "yes", "yes", "yes", "no", "no","yes", "yes","no", "no"))
> data2
id year employ
1 1 2011 yes
2 1 2011 yes
3 2 2011 yes
4 2 2011 yes
5 3 2011 no
6 3 2011 no
7 4 2011 yes
8 4 2011 yes
9 5 2010 no
10 5 2010 no
[Comments]:
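- data %>% group_by(id, year) %>% tidyr::fill(employ, .direction = "updown") %>% ungroup()
- The link provided does not give a complete solution to this problem. I am trying to replace all values (missing and non-missing) with the value from the most recent available year.
- Oops, I meant
data %>% group_by(id) %>% tidyr::fill(employ, .direction = "updown") %>% ungroup()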
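One possible approach, offered as a sketch rather than a confirmed answer: tidyr::fill() only fills in NA entries, so ids 2 and 3 would keep their older non-missing values. To overwrite every row, you could locate, within each id, the row with the largest year among rows where employ is not missing, then copy that row's year and employ to the whole group. The helper column name latest is made up for illustration. If a group has no non-missing employ at all, which.max() falls back to the first row, so employ stays NA for that id.

```r
library(dplyr)

data <- data.frame(
  id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  year = rep(c(2010, 2011), 5),
  employ = c("yes", "yes", "no", "yes", "yes", "no", NA, "yes", "no", NA))

result <- data %>%
  group_by(id) %>%
  mutate(
    # Position of the latest year, ignoring rows where employ is missing
    # (their year is treated as -Inf so they can never win).
    latest = which.max(replace(year, is.na(employ), -Inf)),
    # Overwrite every row in the group with that row's values.
    year = year[latest],
    employ = employ[latest]
  ) %>%
  ungroup() %>%
  select(-latest)

print(as.data.frame(result))
```

With the sample data this should reproduce the year and employ columns of data2 above, including id 5, where the 2011 value is missing and the 2010 row is used instead.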