How to use the spread function in tidyverse using multiple variables?

【Question title】: How to use the spread function in tidyverse using multiple variables?
【Posted】: 2021-03-08 10:36:37
【Question】:

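I have tried several ways of using the spread function in tidyverse on the data below, without success. The goal is to get, for each value of id (1 and 0), a separate column for each of the variables health, ci_high, and ci_low.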

id  unemployment    health  ci_high ci_low
1   5                 100   110       90
1   10                 80   90        70
1   15                 70   80        60
0   5                  90   100       80
0   10                 50   60        40
0   15                 40   50        30

structure(list(id = structure(c(1, 1, 1, 0, 0, 0), format.stata = "%9.0g"), 
    unemployment = structure(c(5, 10, 15, 5, 10, 15), format.stata = "%9.0g"), 
    health = structure(c(100, 80, 70, 90, 50, 40), format.stata = "%9.0g"), 
    ci_high = structure(c(110, 90, 80, 100, 60, 50), format.stata = "%9.0g"), 
    ci_low = structure(c(90, 70, 60, 80, 40, 30), format.stata = "%9.0g")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

I would like to get output like this:

unemployment    health_id1  health_id0  ci_high_id1 ci_high_id0 ci_low_id1  ci_low_id0
5                    100        90            110         100        90        80
10                   80         50             90         60         70        40
15                   70         40             80         50         60        30

Can anyone point me in the right direction?

【Comments】:

Tags: r dplyr tidyverse

【Solution 1】:

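Using pivot_wider (id_cols is passed by name here, since newer tidyr releases expect it that way rather than by position):

library(tidyr)
pivot_wider(df, id_cols = unemployment, names_from = id, values_from = c("health", "ci_high", "ci_low"), names_prefix = "id")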

# A tibble: 3 x 7
  unemployment health_id1 health_id0 ci_high_id1 ci_high_id0 ci_low_id1 ci_low_id0
         <dbl>      <dbl>      <dbl>       <dbl>       <dbl>      <dbl>      <dbl>
1            5        100         90         110         100         90         80
2           10         80         50          90          60         70         40
3           15         70         40          80          50         60         30
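
Since the question asks about spread specifically, here is a minimal sketch of the older gather/spread route, which may help if pivot_wider is not available. It assumes the data from the question rebuilt as a plain tibble (without the Stata attributes); the names `variable`, `value`, `name`, and `wide` are illustrative choices, not from the original post. Note that spread orders the new columns alphabetically, so the column order can differ from the pivot_wider result.

```r
library(tidyr)
library(dplyr)

# Data from the question, rebuilt without the format.stata attributes
df <- tibble::tibble(
  id           = c(1, 1, 1, 0, 0, 0),
  unemployment = c(5, 10, 15, 5, 10, 15),
  health       = c(100, 80, 70, 90, 50, 40),
  ci_high      = c(110, 90, 80, 100, 60, 50),
  ci_low       = c(90, 70, 60, 80, 40, 30)
)

wide <- df %>%
  # Stack the three measures into one key/value pair of columns
  gather(key = "variable", value = "value", health, ci_high, ci_low) %>%
  # Build the target column name, e.g. "health_id1"
  mutate(name = paste0(variable, "_id", id)) %>%
  select(unemployment, name, value) %>%
  # Spread the single value column back out, one column per name
  spread(key = name, value = value)

wide
```

The idea is that spread only handles one value column at a time, so the three measures are first stacked into a single column and the measure name is folded into the key.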

Using data.table

library(data.table)
dt <- as.data.table(df)
dcast(dt, unemployment ~ id, value.var = c("health", "ci_high", "ci_low"))

   unemployment health_0 health_1 ci_high_0 ci_high_1 ci_low_0 ci_low_1
1:            5       90      100       100       110       80       90
2:           10       50       80        60        90       40       70
3:           15       40       70        50        80       30       60
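
The dcast result above uses names such as health_0 and orders the id 0 columns first, which differs slightly from the output requested in the question. As a rough sketch, assuming the same data rebuilt as a data.table, the columns could be renamed and reordered afterwards; the regular expression and the object name `wide` are illustrative choices.

```r
library(data.table)

# Data from the question, rebuilt without the format.stata attributes
dt <- data.table(
  id           = c(1, 1, 1, 0, 0, 0),
  unemployment = c(5, 10, 15, 5, 10, 15),
  health       = c(100, 80, 70, 90, 50, 40),
  ci_high      = c(110, 90, 80, 100, 60, 50),
  ci_low       = c(90, 70, 60, 80, 40, 30)
)

wide <- dcast(dt, unemployment ~ id, value.var = c("health", "ci_high", "ci_low"))

# Turn a trailing "_0" / "_1" into "_id0" / "_id1"; "unemployment" is left unchanged
setnames(wide, names(wide), sub("_([01])$", "_id\\1", names(wide)))

# Put the columns in the order requested in the question
setcolorder(wide, c("unemployment",
                    "health_id1", "health_id0",
                    "ci_high_id1", "ci_high_id0",
                    "ci_low_id1", "ci_low_id0"))

wide
```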

【Discussion】: