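【Posted】: 2022-02-07 01:23:25
【Question】:
Consider the following randomly generated MWE (minimal working example).
For each row, I am trying to determine which variable holds the date closest to the constant reference_day from the left (earlier than it), and which variable holds the date closest to reference_day from the right (later than it).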
df1 <-
structure(
list(id = 1:5,
gender = c("female", "male", "male", "male", "male"),
reference_day = structure(c(18052, NA, 18052, 18052, 18052), class = "Date"),
var1 = structure(c(16505, 17144, 18139, NA, 16639), class = "Date"),
var2 = structure(c(NA, 18042, 16544, 16697, NA), class = "Date"),
var3 = structure(c(17845, 18070, 17152, 16571, NA), class = "Date")),
row.names = c(NA, -5L), class = "data.frame")
df1
  id gender reference_day       var1       var2       var3
1  1 female    2019-06-05 2015-03-11       <NA> 2018-11-10
2  2   male          <NA> 2016-12-09 2019-05-26 2019-06-23
3  3   male    2019-06-05 2019-08-31 2015-04-19 2016-12-17
4  4   male    2019-06-05       <NA> 2015-09-19 2015-05-16
5  5   male    2019-06-05 2015-07-23       <NA>       <NA>
The result I want looks like this:
  id gender reference_day       var1       var2       var3 closest_to_left closest_to_right
1  1 female    2019-06-05 2015-03-11       <NA> 2018-11-10            var3             <NA>
2  2   male          <NA> 2016-12-09 2019-05-26 2019-06-23            <NA>             <NA>
3  3   male    2019-06-05 2019-08-31 2015-04-19 2016-12-17            var3             var1
4  4   male    2019-06-05       <NA> 2015-09-19 2015-05-16            var2             <NA>
5  5   male    2019-06-05 2015-07-23       <NA>       <NA>            var1             <NA>
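After a lot of trial and error, I did manage to find a solution using dplyr's case_when function, but it requires a lot of boilerplate code, which makes me think there must be a smarter solution.
I personally prefer dplyr, but any help is much appreciated.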
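For reference, the selection rule implied by the desired output can be sketched in base R. This is only one possible approach, not the case_when solution mentioned above. It rests on a few assumptions that the question does not state: the helper name pick_closest is hypothetical, a date exactly equal to reference_day is counted on neither side, and ties go to the first matching column.

```r
# Example data, copied from the question.
df1 <- structure(
  list(id = 1:5,
       gender = c("female", "male", "male", "male", "male"),
       reference_day = structure(c(18052, NA, 18052, 18052, 18052), class = "Date"),
       var1 = structure(c(16505, 17144, 18139, NA, 16639), class = "Date"),
       var2 = structure(c(NA, 18042, 16544, 16697, NA), class = "Date"),
       var3 = structure(c(17845, 18070, 17152, 16571, NA), class = "Date")),
  row.names = c(NA, -5L), class = "data.frame")

date_cols <- c("var1", "var2", "var3")

# Signed distance in days for every candidate column:
# negative = before reference_day, positive = after it.
diffs <- sapply(date_cols, function(col) {
  as.numeric(df1[[col]] - df1$reference_day)
})

# Absolute distances, keeping only one side of reference_day per matrix.
# Assumption: a distance of exactly 0 belongs to neither side.
left  <- ifelse(diffs < 0, -diffs, NA)
right <- ifelse(diffs > 0,  diffs, NA)

# Hypothetical helper: name of the smallest non-NA entry, or NA if there is none.
pick_closest <- function(x) {
  if (all(is.na(x))) return(NA_character_)
  names(x)[which.min(x)]  # which.min() skips NA and returns the first minimum
}

df1$closest_to_left  <- apply(left,  1, pick_closest)
df1$closest_to_right <- apply(right, 1, pick_closest)

df1
```

Adding more candidate columns only requires extending date_cols, which avoids writing one case_when branch per column.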
【Discussion】: