【Question title】: How to use the spread function in tidyverse using multiple variables?
【Posted】: 2021-03-08 10:36:37
【Question description】:

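I have tried several ways of applying the spread function in tidyverse to the data below, without success. The goal is to get, for each id (1 and 0), a separate column holding the values of each of the variables health, ci_high, and ci_low.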

id  unemployment    health  ci_high ci_low
1   5                 100   110       90
1   10                 80   90        70
1   15                 70   80        60
0   5                  90   100       80
0   10                 50   60        40
0   15                 40   50        30

structure(list(id = structure(c(1, 1, 1, 0, 0, 0), format.stata = "%9.0g"), 
    unemployment = structure(c(5, 10, 15, 5, 10, 15), format.stata = "%9.0g"), 
    health = structure(c(100, 80, 70, 90, 50, 40), format.stata = "%9.0g"), 
    ci_high = structure(c(110, 90, 80, 100, 60, 50), format.stata = "%9.0g"), 
    ci_low = structure(c(90, 70, 60, 80, 40, 30), format.stata = "%9.0g")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

I would like to get output like this:

unemployment    health_id1  health_id0  ci_high_id1 ci_high_id0 ci_low_id1  ci_low_id0
5                    100        90            110         100        90        80
10                   80         50             90         60         70        40
15                   70         40             80         50         60        30

Can anyone point me in the right direction?

【Question discussion】:

    Tags: r dplyr tidyverse


    【Solution 1】:

    Using pivot_wider

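    library(tidyr)

    # id_cols is named explicitly; recent tidyr versions do not accept it by position
    pivot_wider(df, id_cols = unemployment, names_from = id, values_from = c("health", "ci_high", "ci_low"), names_prefix = "id")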
    
    # A tibble: 3 x 7
      unemployment health_id1 health_id0 ci_high_id1 ci_high_id0 ci_low_id1 ci_low_id0
             <dbl>      <dbl>      <dbl>       <dbl>       <dbl>      <dbl>      <dbl>
    1            5        100         90         110         100         90         80
    2           10         80         50          90          60         70         40
    3           15         70         40          80          50         60         30
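
Since the question asks about spread specifically, here is a minimal sketch of the same reshape using the older gather/spread pair. It assumes a plain data.frame rebuilt from the question's data (without the Stata format attributes); the intermediate names `long`, `wide`, `variable`, `value`, and `name` are illustrative and not part of the original answer.

```r
library(tidyr)

# Question data, rebuilt as a plain data.frame (assumption: attributes dropped)
df <- data.frame(
  id           = c(1, 1, 1, 0, 0, 0),
  unemployment = c(5, 10, 15, 5, 10, 15),
  health       = c(100, 80, 70, 90, 50, 40),
  ci_high      = c(110, 90, 80, 100, 60, 50),
  ci_low       = c(90, 70, 60, 80, 40, 30)
)

# Step 1: stack the three value columns into variable/value pairs
long <- gather(df, key = "variable", value = "value", health, ci_high, ci_low)

# Step 2: build the target column name, e.g. "health_id1"
long$name <- paste0(long$variable, "_id", long$id)

# Step 3: keep only the row key, the new name and the value, then spread
long <- long[, c("unemployment", "name", "value")]
wide <- spread(long, key = name, value = value)
```

spread can only widen one value column at a time, which is why the three variables are first stacked and then given a combined name. The resulting columns may come out in a different order than in the pivot_wider output.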
    

    Using data.table

    library(data.table)

    dt <- as.data.table(df)
    dcast(dt, unemployment ~ id, value.var = c("health", "ci_high", "ci_low"))
    
       unemployment health_0 health_1 ci_high_0 ci_high_1 ci_low_0 ci_low_1
    1:            5       90      100       100       110       80       90
    2:           10       50       80        60        90       40       70
    3:           15       40       70        50        80       30       60
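
The dcast result uses suffixes such as `_0` and `_1` rather than the `_id0` and `_id1` requested in the question. A minimal sketch of one way to rename them afterwards, assuming the data is rebuilt as a plain data.table and that a regular-expression rename is acceptable (the name `wide` is illustrative):

```r
library(data.table)

# Question data, rebuilt as a data.table (assumption: attributes dropped)
dt <- data.table(
  id           = c(1, 1, 1, 0, 0, 0),
  unemployment = c(5, 10, 15, 5, 10, 15),
  health       = c(100, 80, 70, 90, 50, 40),
  ci_high      = c(110, 90, 80, 100, 60, 50),
  ci_low       = c(90, 70, 60, 80, 40, 30)
)

# Same cast as in the answer: one column per value variable and id
wide <- dcast(dt, unemployment ~ id, value.var = c("health", "ci_high", "ci_low"))

# Rewrite a trailing "_0" / "_1" as "_id0" / "_id1"; "unemployment" is left unchanged
setnames(wide, old = names(wide), new = sub("_([01])$", "_id\\1", names(wide)))
```

The column order still follows dcast (id 0 before id 1); `setcolorder` could be used if the exact order from the question matters.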
    

    【Discussion】:
