[Posted]: 2018-11-03 17:47:43
[Question]:
Input
I have a data frame like the following:
structure(list(DistalLESfromnarescm = c("31.9", "31.9", "33.1",
"33.3", "33.8", "34.0"), LESmidpointfromnarescm = c("31.2", "31.2",
"32.0", "32.0", "33.1", "33.2"), ProximalLESfromnarescm = c("30.1",
"30.1", "30.9", "30.9", "31.8", "31.9"), LESlengthcm = c("1.8",
"1.8", "2.2", "2.5", "2.0", "2.1"), EsophageallengthLESUEScenterscm = c("12.1",
"12.1", "14.0", "15.0", "15.1", NA), PIPfromnarescm = c("37.8",
"37.8", "No", "No", "34.3", "35.8"), Hosp_Id = c("A", "A", "B",
"B", "C", "D")), .Names = c("DistalLESfromnarescm", "LESmidpointfromnarescm",
"ProximalLESfromnarescm", "LESlengthcm", "EsophageallengthLESUEScenterscm",
"PIPfromnarescm", "Hosp_Id"), row.names = c(NA, -6L), class = "data.frame")
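Aim
I want to merge the values in any row with those of the previous row if: a) the hospital number is the same, and b) the values in a given column differ between the grouped rows.
The problem I'm having is how to use lapply inside dplyr, because I don't know what to refer to on the left-hand side of the lapply statement.
Attempt 1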
result2 <- Question %>%
group_by(HospNum_Id,DistalLESfromnarescm)%>%
ifelse(HospNum_Id==lag(HospNum_Id),
lapply(WHAT DO I REFER TO HERE function(x) ifelse(x==lag(x), x,paste0(x,"::",lead(x)),"No")),"No")
Desired output
structure(list(DistalLESfromnarescm = c("31.9",
"33.1:33.3", "33.8", "34.0"), LESmidpointfromnarescm = c("31.2",
"32.0", "33.1", "33.2"), ProximalLESfromnarescm = c(
"30.1", "30.9", "31.8", "31.9"), LESlengthcm = c(
"1.8", "2.2:2.5", "2.0", "2.1"), EsophageallengthLESUEScenterscm = c(
"12.1", "14.0:15.0", "15.1", NA), PIPfromnarescm = c(
"37.8", "No", "34.3", "35.8"), Hosp_Id = c( "A",
"B", "C", "D")), .Names = c("DistalLESfromnarescm", "LESmidpointfromnarescm",
"ProximalLESfromnarescm", "LESlengthcm", "EsophageallengthLESUEScenterscm",
"PIPfromnarescm", "Hosp_Id"), row.names = c(NA, -4L), class = "data.frame")
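For reference, the input and desired output above are consistent with a per-group collapse of distinct values, which does not need lapply at all. The following is only a minimal sketch, assuming dplyr 1.0.0 or later (for `across()`, `relocate()` and `.groups`) and assuming the rule is "one row per Hosp_Id, with distinct non-missing values joined by a single colon". The helper name `collapse_unique` is hypothetical, and the separator follows the desired output (":") rather than the "::" used in Attempt 1.

```r
library(dplyr)

# Input data frame from the question
Question <- data.frame(
  DistalLESfromnarescm = c("31.9", "31.9", "33.1", "33.3", "33.8", "34.0"),
  LESmidpointfromnarescm = c("31.2", "31.2", "32.0", "32.0", "33.1", "33.2"),
  ProximalLESfromnarescm = c("30.1", "30.1", "30.9", "30.9", "31.8", "31.9"),
  LESlengthcm = c("1.8", "1.8", "2.2", "2.5", "2.0", "2.1"),
  EsophageallengthLESUEScenterscm = c("12.1", "12.1", "14.0", "15.0", "15.1", NA),
  PIPfromnarescm = c("37.8", "37.8", "No", "No", "34.3", "35.8"),
  Hosp_Id = c("A", "A", "B", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the distinct non-missing values of x with `sep`.
# Returns a real NA (not the string "NA") when every value is missing.
collapse_unique <- function(x, sep = ":") {
  vals <- unique(x[!is.na(x)])
  if (length(vals) == 0) NA_character_ else paste(vals, collapse = sep)
}

result <- Question %>%
  group_by(Hosp_Id) %>%
  # across(everything(), ...) applies the helper to every non-grouping column
  summarise(across(everything(), collapse_unique), .groups = "drop") %>%
  # summarise() puts the grouping column first; move it back to the end
  relocate(Hosp_Id, .after = last_col()) %>%
  as.data.frame()

print(result)
```

Here `across(everything(), ...)` plays the role the lapply call was meant to play: it refers to each non-grouping column in turn, so nothing has to be named on the left-hand side.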
[Comments]:
-
Thanks for providing the input. Now, if you could provide example output, you'd have a fully described question!
-
@De Novo Output added as requested
-
See also this [r-faq] on summarizing several (or all) variables, also using dplyr: Aggregate / summarize multiple variables per group (e.g. sum, mean)