【Posted】:2019-08-15 00:06:30
【Question】:
Suppose we have the following data frame:
ID <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6)
age <- c(25, 25, 25, 22, 22, 56, 56, 56, 80, 33, 33, 90, 90, 90, 5, 5, 5)
gender <- c("m", "m", NA, "f", "f", "m", NA, "m", "m", "m", NA, NA, NA, "m", NA, NA, NA)
company <- c("c1", "c2", "c2", "c3", "c3", "c1", "c1", "c1", "c1", "c5", "c5", "c3", "c4", "c5", "c3", "c1", "c1")
income <- c(1000, 1000, 1000, 500, 1700, 200, 200, 250, 500, 700, 700, 300, 350, 300, 500, 1700, 200)
df <- data.frame(ID, age, gender, company, income)
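This data has 6 unique IDs, and the gender variable sometimes contains NA.
I want to replace each NA with the correct gender category for that ID. If every gender value for an ID is NA, it should be left as it is.
The expected result is: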
【Discussion】:
- You can use fill from tidyr, grouping by ID so that values are only filled within the same person:
df %>% group_by(ID) %>% fill(gender) %>% fill(gender, .direction = "up")
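For readers who prefer not to load dplyr and tidyr, the same idea can be sketched in base R with ave(). This is only one possible approach, not the commenter's method. It assumes that gender is constant within an ID, so the first non-NA value in each group is used for the whole group. The helper name fill_gender is made up for this illustration.

```r
# Data from the question
ID <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6)
age <- c(25, 25, 25, 22, 22, 56, 56, 56, 80, 33, 33, 90, 90, 90, 5, 5, 5)
gender <- c("m", "m", NA, "f", "f", "m", NA, "m", "m", "m", NA, NA, NA, "m", NA, NA, NA)
company <- c("c1", "c2", "c2", "c3", "c3", "c1", "c1", "c1", "c1", "c5", "c5", "c3", "c4", "c5", "c3", "c1", "c1")
income <- c(1000, 1000, 1000, 500, 1700, 200, 200, 250, 500, 700, 700, 300, 350, 300, 500, 1700, 200)
df <- data.frame(ID, age, gender, company, income)

# Hypothetical helper: fill the NAs of one group with its first known value.
# Assumption: every non-NA gender within an ID is the same.
fill_gender <- function(g) {
  known <- g[!is.na(g)]
  if (length(known) == 0) {
    return(g)  # all values are NA for this ID: leave the group unchanged
  }
  g[is.na(g)] <- known[1]
  g
}

# as.character() guards against gender being a factor (the default in R < 4.0)
df$gender <- ave(as.character(df$gender), df$ID, FUN = fill_gender)
```

After this runs, print(df) shows gender filled in for IDs 1 to 5, while the three rows for ID 6 remain NA. Unlike fill, which copies the nearest value above or below, this version does not depend on row order, but it would silently pick the first value if an ID ever had conflicting genders.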
Tags: r