【Question Title】: Way to create a summary table in R with two grouping factors
【Posted】: 2020-10-22 19:26:49
【Question】:

I am trying to create a summary table in R with two grouping factors from the following data:

   species        plot   type  `mean(C)`
   <fct>        <fct> <fct>     <dbl>
 1 CA          MI  A         -35.7
 2 CA          MI  B         -35.6
 3 CA          MI  C         -35.9
 4 FO          MI  A         -35.7
 5 FO          MI  B         -34.9
 6 FO          MI  C         -35.3
 7 HE          MI  A         -35.4
 8 HE          MI  B         -35.6
 9 HE          MI  C         -35.6
10 LA          MI  A         -36.5

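mean(C) is the response variable I want to display, and I would like it split by type and species; i.e. types as columns and species as rows.

None of the packages I have tried (xtable, stargazer, gtsummary) seem able to do this. I could of course fill the table in by hand, but it would be good to know whether a package can do it. Does anyone have any ideas?

Many thanks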

【Comments on the question】:

  • Please state the packages you are using
  • pivot_wider

Tags: r data-visualization gtsummary


【Solution 1】:

Here is one solution using the tidyr package.

library(tidyr)
# define your data as dataframe df
df <- data.frame("species" = c("CA","CA","CA","FO","FO", "FO", "HE", "HE", "HE", "LA"),
                         "plot" = c("MI"),
                         "type" = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A"),
                         "mean" = c(-35.7, -35.6, -35.9, -35.7, -34.9, -35.3, -35.4, -35.6, -35.6, -36.5))
# pivot df around 'type', using 'mean'   
df %>%
  pivot_wider(names_from = type, values_from = mean)

This returns:

> df %>%
+   pivot_wider(names_from = type, values_from = mean)
# A tibble: 4 x 5
  species plot      A     B     C
  <fct>   <fct> <dbl> <dbl> <dbl>
1 CA      MI    -35.7 -35.6 -35.9
2 FO      MI    -35.7 -34.9 -35.3
3 HE      MI    -35.4 -35.6 -35.6
4 LA      MI    -36.5  NA    NA
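
If loading tidyr is not an option, the same species-by-type layout can also be sketched in base R with tapply(). This is not part of the answer above, only an illustration; the column name mean_C is an assumption standing in for the question's `mean(C)` column.

```r
# Rebuild the example data; mean_C is a stand-in name for the `mean(C)` column.
df <- data.frame(
  species = c("CA", "CA", "CA", "FO", "FO", "FO", "HE", "HE", "HE", "LA"),
  plot    = "MI",
  type    = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A"),
  mean_C  = c(-35.7, -35.6, -35.9, -35.7, -34.9, -35.3, -35.4, -35.6, -35.6, -36.5)
)

# Cross-tabulate: species as rows, type as columns.
# Each cell holds a single value here, so mean() simply returns it;
# combinations absent from the data (LA with B or C) come out as NA.
tab <- tapply(df$mean_C, list(species = df$species, type = df$type), mean)

print(tab)
```

Note that the result is a numeric matrix rather than a data frame, and the plot column is not carried over.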

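【Comments】:

    【Solution 2】:

    If you just want a data frame with the values of the type variable presented as columns, the spread function from the tidyverse is what you are looking for.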

    library(tidyr)
    
    df <- data.frame(species = c("CA", "CA", "CA", "FO", "FO", "FO", "HE", "HE", "HE", "LA"),
                     plot = "MI", 
                     type = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A"),
                     mean_C = c(-35.7,-35.6,-35.9, -35.7,-34.9,-35.3, -35.4,-35.6, -35.6,-36.5))
    
    
    new_df <- df %>%
      spread(type, mean_C)
    
    print(new_df)
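
The tidyr documentation marks spread() as superseded and recommends pivot_wider() for new code, although spread() still works. A minimal sketch of the equivalent call on the same data frame follows; it assumes tidyr 1.0.0 or later is installed and reuses the answer's mean_C column name.

```r
library(tidyr)

# Same example data as in the answer above.
df <- data.frame(
  species = c("CA", "CA", "CA", "FO", "FO", "FO", "HE", "HE", "HE", "LA"),
  plot    = "MI",
  type    = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A"),
  mean_C  = c(-35.7, -35.6, -35.9, -35.7, -34.9, -35.3, -35.4, -35.6, -35.6, -36.5)
)

# names_from plays the role of spread()'s key argument,
# values_from the role of its value argument.
new_df <- pivot_wider(df, names_from = type, values_from = mean_C)

print(new_df)
```

The species and plot columns are kept as identifiers, and the missing LA/B and LA/C combinations are filled with NA.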
    

    【Comments】:

      【Solution 3】:

      You can use tbl_strata() in {gtsummary} to add any number of stratifying variables. Example below!

      library(gtsummary)
      packageVersion("gtsummary")
      #> [1] '1.5.0'
      
      tbl <-
        trial %>%
        mutate(grade = paste("Grade", grade)) %>%
        tbl_strata(
          strata = grade,
          ~ .x %>%
            tbl_summary(
              by = trt, 
              include = c(age, response),
              missing = "no"
            )
        )
      

      Created on 2022-01-07 by the reprex package (v2.0.1)

      【Comments】:
