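[Posted]: 2019-07-17 00:36:25
[Question]:
Hi, I have this dataset with a unique id variable in column A, followed by each patient's follow-up renal scans. It is a csv file, and I would like to reshape it to long format using R if possible. Each participant can have between 1 and 17 renal scans.
Some IDs are also listed as "No" because no scan was received. I would like it reshaped into something like the second picture.
I know there are previous questions about reshaping data that is organized by year. In my case each participant has multiple scans, each with its own date in yyyy-mm-dd format.
Please see the data below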
structure(list(id = c(1010001, 1010002, 1010004, 1010005, 1010006,
1010007), `GFR Scans?` = c("Yes", "Yes", "Yes", "Yes", "Yes",
"No"), `1. Date of renal scan:` = structure(c(1133913600, 1196812800,
1237334400, 1124150400, 1192060800, NA), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), `1. Type of renal scan:` = c("DTPA",
"DTPA", "DTPA", "DTPA", "DTPA", NA), `1. GFR mL/1.73 sq.m` = c(18,
13, 68, 117, 46, NA), `1. Pre/Post tx?` = c("Pre", "Pre", "Post",
"Post", "Pre", NA), `2. Date of renal scan:` = structure(c(1146528000,
1214524800, NA, 1151366400, 1245974400, NA), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), `2. Type of renal scan:` = c("DTPA",
"DTPA", NA, "DTPA", "DTPA", NA), `2. GFR mL/1.73 sq.m` = c(86,
110, NA, 148, 123, NA), `2. Pre/Post tx?` = c("Post", "Post",
NA, "Post", "Post", NA), `3. Date of renal scan:` = structure(c(NA,
1219104000, NA, 1184025600, NA, NA), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), `3. Type of renal scan:` = c(NA, "DTPA", NA,
"DTPA", NA, NA), `3. GFR mL/1.73 sq.m` = c(NA, 92, NA, 166, NA,
NA), `3. Pre/Post tx?` = c(NA, "Post", NA, "Post", NA, NA), `4. Date of renal scan:` = structure(c(NA,
1242691200, NA, 1213660800, NA, NA), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), `4. Type of renal scan:` = c(NA, "DTPA", NA,
"DTPA", NA, NA), `4. GFR mL/1.73 sq.m` = c(NA, 36, NA, 171, NA,
NA), `4. Pre/Post tx?` = c(NA, "Post", NA, "Post", NA, NA), `5. Date of renal scan:` = structure(c(NA,
NA, NA, 1288656000, NA, NA), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
`5. Type of renal scan:` = c(NA, NA, NA, "DTPA", NA, NA),
`5. GFR mL/1.73 sq.m` = c(NA, NA, NA, 105, NA, NA), `5. Pre/Post tx?` = c(NA,
NA, NA, "Post", NA, NA), `6. Date of renal scan:` = structure(c(NA,
NA, NA, 1323129600, NA, NA), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), `6. Type of renal scan:` = c(NA, NA, NA,
"DTPA", NA, NA), `6. GFR mL/1.73 sq.m` = c(NA, NA, NA, 103,
NA, NA), `6. Pre/Post tx?` = c(NA, NA, NA, "Post", NA, NA
), `7. Date of renal scan:` = structure(c(NA, NA, NA, 1355184000,
NA, NA), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
`7. Type of renal scan:` = c(NA, NA, NA, "DTPA", NA, NA),
`7. GFR mL/1.73 sq.m` = c(NA, NA, NA, 98, NA, NA), `7. Pre/Post tx?` = c(NA,
NA, NA, "Post", NA, NA), `8. Date of renal scan:` = c(NA,
NA, NA, NA, NA, NA), `8. Type of renal scan:` = c(NA, NA,
NA, NA, NA, NA), `8. GFR mL/1.73 sq.m` = c(NA, NA, NA, NA,
NA, NA), `8. Pre/Post tx?` = c(NA, NA, NA, NA, NA, NA), `9. Date of renal scan:` = c(NA,
NA, NA, NA, NA, NA), `9. Type of renal scan:` = c(NA, NA,
NA, NA, NA, NA), `9. GFR mL/1.73 sq.m` = c(NA, NA, NA, NA,
NA, NA), `9. Pre/Post tx?` = c(NA, NA, NA, NA, NA, NA), `10. Date of renal scan:` = c(NA,
NA, NA, NA, NA, NA), `10. Type of renal scan:` = c(NA, NA,
NA, NA, NA, NA), `10. GFR mL/1.73 sq.m` = c(NA, NA, NA, NA,
NA, NA), `10. Pre/Post tx?` = c(NA, NA, NA, NA, NA, NA),
`11. Date of renal scan:` = c(NA, NA, NA, NA, NA, NA), `11. Type of renal scan:` = c(NA,
NA, NA, NA, NA, NA), `11. GFR mL/1.73 sq.m` = c(NA, NA, NA,
NA, NA, NA), `11. Pre/Post tx?` = c(NA, NA, NA, NA, NA, NA
), `12. Date of renal scan:` = c(NA, NA, NA, NA, NA, NA),
`12. Type of renal scan:` = c(NA, NA, NA, NA, NA, NA), `12. GFR mL/1.73 sq.m` = c(NA,
NA, NA, NA, NA, NA), `12. Pre/Post tx?` = c(NA, NA, NA, NA,
NA, NA), `13. Date of renal scan:` = c(NA, NA, NA, NA, NA,
NA), `13. Type of renal scan:` = c(NA, NA, NA, NA, NA, NA
), `13. GFR mL/1.73 sq.m` = c(NA, NA, NA, NA, NA, NA), `13. Pre/Post tx?` = c(NA,
NA, NA, NA, NA, NA), `14. Date of renal scan:` = c(NA, NA,
NA, NA, NA, NA), `14. Type of renal scan:` = c(NA, NA, NA,
NA, NA, NA), `14. GFR mL/1.73 sq.m` = c(NA, NA, NA, NA, NA,
NA), `14. Pre/Post tx?` = c(NA, NA, NA, NA, NA, NA), `15. Date of renal scan:` = c(NA,
NA, NA, NA, NA, NA), `15. Type of renal scan:` = c(NA, NA,
NA, NA, NA, NA), `15. GFR mL/1.73 sq.m` = c(NA, NA, NA, NA,
NA, NA), `15. Pre/Post tx?` = c(NA, NA, NA, NA, NA, NA),
`16. Date of renal scan:` = c(NA, NA, NA, NA, NA, NA), `16. Type of renal scan:` = c(NA,
NA, NA, NA, NA, NA), `16. GFR mL/1.73 sq.m` = c(NA, NA, NA,
NA, NA, NA), `16. Pre/Post tx?` = c(NA, NA, NA, NA, NA, NA
), `17. Date of renal scan:` = c(NA, NA, NA, NA, NA, NA),
`17. Type of renal scan:` = c(NA, NA, NA, NA, NA, NA), `17. GFR mL/1.73 sq.m` = c(NA,
NA, NA, NA, NA, NA), `17. Pre/Post tx?` = c(NA, NA, NA, NA,
NA, NA)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
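The first picture is the original wide format, and the second picture is what I would like to get. Because several columns are involved for each scan, none of the other wide-to-long answers have helped me.
For example, id 1010001 has had two scans, which I need listed one below the other rather than side by side (see picture 2).
Thank you very much for your help.
[Comments]: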
-
So the idea is to sort the table by ID and move the second and third groups of columns under the first?
-
Yes, group by ID and then list the subsequent scans underneath rather than side by side. Some IDs have up to 17 scans (columns off to the side).
-
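Also, some IDs did not receive any scans and are listed as No. These need to be listed too, with just one row each, since they have no follow-up columns.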
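One possible way to get the layout described in these comments is sketched below. It is not taken from the thread: it assumes tidyr 1.0.0 or later (for `pivot_longer()`) plus dplyr, and it runs on a small made-up stand-in (`wide`, two scan slots and two of the four measures) rather than the full posted data. The names `wide`, `long`, and `scan` are illustrative. The numeric prefix of each column name becomes a scan number, and the rest of the name becomes the output column.

```r
library(tidyr)
library(dplyr)

# Toy stand-in for the posted data: two scan slots instead of 17,
# and only the date and GFR columns (assumption, for brevity).
wide <- tibble(
  id = c(1010001, 1010004, 1010007),
  `GFR Scans?` = c("Yes", "Yes", "No"),
  `1. Date of renal scan:` = as.POSIXct(c("2005-12-07", "2009-03-18", NA), tz = "UTC"),
  `1. GFR mL/1.73 sq.m` = c(18, 68, NA),
  `2. Date of renal scan:` = as.POSIXct(c("2006-05-02", NA, NA), tz = "UTC"),
  `2. GFR mL/1.73 sq.m` = c(86, NA, NA)
)

long <- wide %>%
  pivot_longer(
    cols = -c(id, `GFR Scans?`),
    # "1. Date of renal scan:" -> scan = "1", column = "Date of renal scan:"
    names_to = c("scan", ".value"),
    names_pattern = "^(\\d+)\\. (.*)$"
  ) %>%
  mutate(scan = as.integer(scan)) %>%
  # Keep every real scan, plus the first (empty) slot so that
  # IDs with no scans still appear as a single row.
  filter(scan == 1L | !is.na(`Date of renal scan:`))

print(long)
```

On the toy data this should leave two rows for id 1010001, one for 1010004, and a single all-NA row for 1010007. On the full data the same call would need all 17 scan slots to follow the `N. name` pattern, and it has not been checked here against the all-NA columns for scans 8 to 17.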
Tags: python r excel dataframe reshape