Sorting overlapping categorical variables in {gtsummary}

[Question title]: Sorting overlapping categorical variables in {gtsummary}
[Posted]: 2022-01-15 01:17:02
[Question]:

require(gtsummary)

test <- structure(list(`1` = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0), `2` = c(1,0, 0, 0, 0, 1, 0, 1, 0, 0), `3` = c(0, 0, 0, 0, 0, 0, 0, 0, 0,0), `4` = c(1, 1, 0, 0, 1, 0, 0, 0, 0, 0), `5` = c(1, 0, 1, 1,0, 1, 1, 0, 0, 0), `6` = c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0), `7` = c(0,0, 0, 0, 0, 0, 0, 0, 0, 0), `8` = c(0, 0, 0, 0, 0, 0, 0, 0, 0,0), `9` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `10` = c(0, 0, 0,0, 0, 0, 0, 0, 0, 1)), row.names = c(NA, -10L), class = c("tbl_df","tbl", "data.frame"))

In this example data, I have 10 categorical variables.

     `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1     0     1     0     1     1     0     0     0     0     0
 2     0     0     0     1     0     0     0     0     0     0
 3     0     0     0     0     1     0     0     0     0     0
 4     0     0     0     0     1     1     0     0     0     0
 5     0     0     0     1     0     0     0     0     0     0
 6     0     1     0     0     1     0     0     0     0     0
 7     0     0     0     0     1     1     0     0     0     0
 8     0     1     0     0     0     0     0     0     0     0
 9     1     0     0     0     0     0     0     0     0     0
10     0     0     0     0     0     0     0     0     0     1

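Because the categories can overlap, I store each one in its own column, with 1 meaning the category is present and 0 meaning it is absent.

Running test %>% tbl_summary() creates a summary table with one row per variable, listed in column order.

I would like to sort the rows by frequency, but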

test %>% tbl_summary(sort = list(everything() ~ "frequency"))

does not work.

Is there a way to do this? Thank you in advance.

[Comments]:

Tags: r dplyr gtsummary

[Solution 1]:

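The tbl_summary(sort=) argument sorts the levels within a variable, not the order in which the variables appear in the table. Variables appear in the table in the same order as they appear in the data frame.

We can update the column order of the data frame with the code below.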
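To illustrate the distinction, here is a minimal sketch on a made-up data frame (the name toy, the column grp, and its values are assumptions for illustration, not part of the question). For a variable with several levels, sort= reorders the level rows inside that variable. A 0/1 column, by contrast, is summarized as a single row, so there are no levels for sort= to reorder.

```r
library(gtsummary)

# Hypothetical data: one categorical variable with three levels
toy <- data.frame(grp = c("a", "b", "b", "c", "c", "c"))

# Default: levels are listed in alphanumeric order (a, b, c)
tbl_default <- tbl_summary(toy)

# sort = "frequency": levels are reordered by count within the variable (c, b, a)
tbl_freq <- tbl_summary(toy, sort = list(everything() ~ "frequency"))

# Level labels in row order, read from the table's underlying data
level_rows <- tbl_freq$table_body$row_type == "level"
tbl_freq$table_body$label[level_rows]
```

In both tables grp stays where it is among the variables; only the rows under it move.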

library(gtsummary)
#> #Uighur
packageVersion("gtsummary")
#> [1] '1.5.0'

test <- structure(list(`1` = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0), `2` = c(1,0, 0, 0, 0, 1, 0, 1, 0, 0), `3` = c(0, 0, 0, 0, 0, 0, 0, 0, 0,0), `4` = c(1, 1, 0, 0, 1, 0, 0, 0, 0, 0), `5` = c(1, 0, 1, 1,0, 1, 1, 0, 0, 0), `6` = c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0), `7` = c(0,0, 0, 0, 0, 0, 0, 0, 0, 0), `8` = c(0, 0, 0, 0, 0, 0, 0, 0, 0,0), `9` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `10` = c(0, 0, 0,0, 0, 0, 0, 0, 0, 1)), row.names = c(NA, -10L), class = c("tbl_df","tbl", "data.frame")) 

# order variables by prevalence (proportion of 1s), highest first
prev <- purrr::map_dbl(test, mean) %>% sort(decreasing = TRUE)

test %>%
  select(all_of(names(prev))) %>%
  tbl_summary() %>%
  as_kable() # convert to kable for SO

Characteristic	N = 10
5	5 (50%)
2	3 (30%)
4	3 (30%)
6	2 (20%)
1	1 (10%)
10	1 (10%)
3	0 (0%)
7	0 (0%)
8	0 (0%)
9	0 (0%)

Created on 2021-12-10 by the reprex package (v2.0.1)
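The same reordering can also be sketched in base R, without purrr or select(). The small data frame toy and its column names x, y, z are made up for illustration and stand in for the question's test data.

```r
# Hypothetical stand-in for the question's data: three 0/1 indicator columns
toy <- data.frame(
  x = c(0, 0, 1, 0),
  y = c(1, 1, 1, 0),
  z = c(1, 0, 1, 0)
)

# Proportion of 1s in each column
prev <- colMeans(toy)

# Column names ordered from most to least prevalent
ord <- names(sort(prev, decreasing = TRUE))

# Reorder the columns; toy_sorted can then be passed to tbl_summary()
toy_sorted <- toy[, ord, drop = FALSE]
names(toy_sorted)
```

Because tbl_summary() follows the column order of its input, reordering the data frame first is enough to change the row order of the table.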

[Comments]:

Daniel, what is "#> #Uighur"? Oh, I don't have that.
It is a startup message in gtsummary expressing support for #Uighur.