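【Posted】: 2019-10-19 06:59:02
【Question】:
I have a large dataset with many columns that each hold a status. I want to create a new column containing each participant's current status. I'm trying to use case_when in dplyr, but I'm not sure how to apply it across columns. The dataset has too many columns for me to type each one out. Here is a sample of the data: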
library(dplyr)
problem <- tibble(name = c("sally", "jane", "austin", "mike"),
status1 = c("registered", "completed", "registered", "no action"),
status2 = c("completed", "completed", "registered", "no action"),
status3 = c("completed", "completed", "withdrawn", "no action"),
status4 = c("withdrawn", "completed", "no action", "registered"))
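I'd like the code to produce a new column giving each participant's final status; however, if their status was ever completed, I want it to say completed, regardless of what their final status is. For this data, the answer would look like this: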
answer <- tibble(name = c("sally", "jane", "austin", "mike"),
status1 = c("registered", "completed", "registered", "no action"),
status2 = c("completed", "completed", "registered", "no action"),
status3 = c("completed", "completed", "withdrawn", "no action"),
status4 = c("withdrawn", "completed", "no action", "registered"),
finalstatus = c("completed", "completed", "no action", "registered"))
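Also, I would really appreciate any explanation of your code! It would be especially helpful if your solution could also use contains("status"), because in my real dataset the status column names are very messy (e.g. summary_status_5292019, sum_status_07012018, etc.).
Thanks!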
【Discussion】:
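A minimal sketch of one possible approach, not a definitive answer. It assumes the right-most column matched by contains("status") holds the most recent status, which may not be true for the messy real column names and would need checking. The helper names status_cols, ever_completed, last_status and result are introduced here for illustration only.

```r
library(dplyr)

# Sample data copied from the question.
problem <- tibble(name = c("sally", "jane", "austin", "mike"),
                  status1 = c("registered", "completed", "registered", "no action"),
                  status2 = c("completed", "completed", "registered", "no action"),
                  status3 = c("completed", "completed", "withdrawn", "no action"),
                  status4 = c("withdrawn", "completed", "no action", "registered"))

# Select every status column by name pattern instead of typing each one.
status_cols <- problem %>% select(contains("status"))

# Comparing a data frame to a string gives a logical matrix;
# rowSums() counts the TRUEs per row, so > 0 means "completed at least once".
ever_completed <- rowSums(status_cols == "completed", na.rm = TRUE) > 0

# ASSUMPTION: the last matched column is the latest status.
last_status <- status_cols[[ncol(status_cols)]]

result <- problem %>%
  mutate(finalstatus = case_when(
    ever_completed ~ "completed",  # completed at any point takes priority
    TRUE ~ last_status             # otherwise use the latest status
  ))
```

On the sample data, result$finalstatus matches the finalstatus column in the expected answer. Note that if this is run again on a table that already has a finalstatus column, contains("status") will match that column too, so it may be safer to drop it first or use a narrower pattern.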
Tags: r string dplyr conditional-statements case-when