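【Posted】: 2021-03-17 12:00:05
【Problem description】:
I want to collapse the rows for each record_id into a single row based on the type column, but some volunteers in the record_id column have two repeats in the repeat column. For those volunteers I want a second row. Each record_id corresponds to one person, who took the test either once (repeat = 1) or twice; a person tested twice has two entries in the repeat column.
This is what my data looks like: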
structure(list(record_id = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4,
4, 4, 4), type = c(NA, "data_collection", "test", NA, "data_collection",
"test", NA, "data_collection", "test", "test", NA, "cata_collection",
"test", "test"), `repeat` = c(NA, 1, 1, NA, 1, 1, NA, 1, 1, 2,
NA, 1, 1, 2), dt_volunteer_reg = structure(c(1597246320, NA,
NA, 1599217080, NA, NA, 1596184500, NA, NA, NA, 1598192280, NA,
NA, NA), class = c("POSIXct", "POSIXt"), tzone = "UTC"), age = c(26,
NA, NA, 64, NA, NA, 51, NA, NA, NA, 39, NA, NA, NA), gender = c(0,
NA, NA, 1, NA, NA, 0, NA, NA, NA, 1, NA, NA, NA), case_type = c(NA,
1, NA, NA, 2, NA, NA, 1, NA, NA, NA, 1, NA, NA), test_dis_dt = structure(c(NA,
NA, 1597250220, NA, NA, 1600012980, NA, NA, 1596382080, 1601980740,
NA, NA, 1598284020, 1603118700), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), test_dis_res = c(NA, NA, 1, NA, NA, 1, NA,
NA, 2, 2, NA, NA, 2, 2), test_dis_in = c(NA, NA, NA, NA, NA,
0.02, NA, NA, 6.13, 4.75, NA, NA, 7.23, 3.85), test_cont_dt = structure(c(NA,
NA, 1597250280, NA, NA, 1608636120, NA, NA, NA, 1601980740, NA,
NA, 1605704940, 1603205340), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
test_cont_res = c(NA, NA, 2, NA, NA, 1, NA, NA, NA, 2, NA,
NA, 2, 2), test_cont_val = c(NA, NA, 123, NA, NA, 0, NA,
NA, NA, 40000, NA, NA, 471.6, 306.5)), row.names = c(NA,
-14L), class = c("tbl_df", "tbl", "data.frame"))
This is what I would like to get:
structure(list(record_id = c(1, 2, 3, 3, 4, 4), `repeat` = c(1,
1, 1, 2, 1, 2), dt_volunteer_reg = structure(c(1597246320, 1599217080,
1596184500, 1596184500, 1598192280, 1598192280), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), age = c(26, 64, 51, 51, 39, 39), gender = c(0,
1, 0, 0, 1, 1), case_type = c(1, 2, 1, 1, 1, 1), test_dis_dt = structure(c(1597250220,
1600012980, 1596382080, 1601980740, 1598284020, 1603118700), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), test_dis_res = c(1, 1, 2, 2, 2, 2),
test_dis_in = c(NA, 0.02, 6.13, 4.75, 7.23, 3.85), test_cont_dt = structure(c(1597250280,
1608636120, NA, 1601980740, 1605704940, 1603205340), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), test_cont_res = c(2, 1, NA, 2,
2, 2), test_cont_val = c(123, 0, NA, 40000, 471.6, 306.5)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
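【Comments】:
- Please provide a reproducible dataset using dput, along with what you have already tried.
- I have replaced the dataset with the dput output. I tried spread(dat, type, repeat), but it returned the dataset unchanged. Apologies, I am very new to data wrangling.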
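One possible way to get the layout described in the question, offered as a minimal sketch rather than a definitive answer: copy the person-level values onto every row of the same record_id, then keep only the test rows, so that each repeat becomes one output row. The sketch assumes dplyr and tidyr (version 1.0.0 or later) are installed. The small dat below is a trimmed, hypothetical stand-in for the posted data (two volunteers, a few columns), and the name person_cols is introduced here only for illustration.

```r
library(dplyr)
library(tidyr)

# Trimmed stand-in for the posted data: record_id 1 was tested once,
# record_id 3 was tested twice. Only a few of the original columns are kept.
dat <- tibble(
  record_id     = c(1, 1, 1, 3, 3, 3, 3),
  type          = c(NA, "data_collection", "test",
                    NA, "data_collection", "test", "test"),
  `repeat`      = c(NA, 1, 1, NA, 1, 1, 2),
  age           = c(26, NA, NA, 51, NA, NA, NA),
  case_type     = c(NA, 1, NA, NA, 1, NA, NA),
  test_dis_in   = c(NA, NA, NA, NA, NA, 6.13, 4.75),
  test_cont_val = c(NA, NA, 123, NA, NA, NA, 40000)
)

# Assumption: these columns hold one value per person, so they can safely be
# copied to every row of that person. Test columns are deliberately left out,
# so a missing test value stays NA instead of being borrowed from another repeat.
person_cols <- c("age", "case_type")

result <- dat %>%
  group_by(record_id) %>%
  fill(all_of(person_cols), .direction = "downup") %>%  # fill within each person
  ungroup() %>%
  filter(type == "test") %>%  # rows with NA or other type values are dropped
  select(-type)

print(result)
```

With the full data set, person_cols would presumably also need dt_volunteer_reg and gender. Note that repeat is a reserved word in R, so the column has to be written with backticks wherever it is referenced by name.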
Tags: r dplyr concatenation reshape tidyr