【Question title】: Creating summary tables with tidyverse
【Posted】: 2018-03-06 02:41:21
【Question】:

Q1. Is there a more direct (but still tidyverse) way to create a summary table like this one?

library(tidyverse)
library(knitr)
library(kableExtra)
df <- data.frame(group=c(1, 1, 1, 1, 0, 0, 0, 0),
                 v1=c(1, 2, 3, 4, 5, 6, 1, 2),
                 v2=c(4, 3, 2, 5, 3, 5, 3, 8),
                 v3=c(0, 1, 0, 1, 1, 0, 1, 1))

df %>%
  group_by(group) %>%
  summarise(v1=paste0(round(mean(v1), 2),
                      " (",
                      round(sd(v1), 2),
                      ")"),
            v2=paste0(round(mean(v2), 2),
                      " (",
                      round(sd(v2), 2),
                      ")"),
            v3=round(mean(v3)*100, 1)
  ) %>%
  dplyr::select(-group) %>%
  t() %>%
  `rownames<-` (c("v1 mean (SD)",
                  "v2 mean (SD)",
                   "Percent v3")) %>%
  kable("html",
        col.names=c("Group 0", "Group 1")) %>%
  kable_styling()

Q2. Relatedly, is there a way to combine two levels of summarise (e.g., ungrouped + grouped) without repeating the summarise code?

all <- 
df %>%
  summarise(v1=paste0(round(mean(v1), 2),
                      " (",
                      round(sd(v1), 2),
                      ")"),
            v2=paste0(round(mean(v2), 2),
                      " (",
                      round(sd(v2), 2),
                      ")"),
            v3=round(mean(v3)*100, 1)
  ) %>%
  t() %>%
  `rownames<-` (c("v1 mean (SD)",
                  "v2 mean (SD)",
                   "Percent v3")) 

groups <- 
  df %>%
  group_by(group) %>%
  summarise(v1=paste0(round(mean(v1), 2),
                      " (",
                      round(sd(v1), 2),
                      ")"),
            v2=paste0(round(mean(v2), 2),
                      " (",
                      round(sd(v2), 2),
                      ")"),
            v3=round(mean(v3)*100, 1)
  ) %>%
  dplyr::select(-group) %>%
  t() %>%
  `rownames<-` (c("v1 mean (SD)",
                  "v2 mean (SD)",
                  "Percent v3")) 

all %>%
  cbind(groups) %>%
  kable("html",
        col.names=c("All", "Group 0", "Group 1")) %>%
  kable_styling()

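【Comments】:

  • I'm not sure what you are asking. What is wrong with the code you showed?
  • Nothing is wrong in the sense that it works, but the Q2 solution in particular seems inefficient because I define summarise() twice.

Tags: r tidyverse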


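【Answer 1】:

One solution (especially if you want to extend this to more variables than v1, v2, ... in the future) is to move the repeated mean (SD) formatting into a helper function, here called paste_mean_and_sd.

This will shorten your "pipe" and make it easier to read: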

... %>% summarise(v1 = paste_mean_and_sd(v1), v2 = paste_mean_and_sd(v2), v3 = round(mean(v3) * 100, 1)) %>% ...
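
The body of paste_mean_and_sd is not shown in this answer, so the following is only a minimal sketch of what such a helper might look like, assuming it should reproduce the formatting used in the question (mean and SD rounded to two digits). The `digits` argument is a hypothetical addition for illustration.

```r
library(dplyr)

# Hypothetical definition: the answer names paste_mean_and_sd() but does not
# show its body; this version mirrors the paste0()/round() calls in the question.
paste_mean_and_sd <- function(x, digits = 2) {
  paste0(round(mean(x), digits), " (", round(sd(x), digits), ")")
}

# Same example data as in the question
df <- data.frame(group = c(1, 1, 1, 1, 0, 0, 0, 0),
                 v1 = c(1, 2, 3, 4, 5, 6, 1, 2),
                 v2 = c(4, 3, 2, 5, 3, 5, 3, 8),
                 v3 = c(0, 1, 0, 1, 1, 0, 1, 1))

# The grouped summary now needs one short call per variable
res <- df %>%
  group_by(group) %>%
  summarise(v1 = paste_mean_and_sd(v1),
            v2 = paste_mean_and_sd(v2),
            v3 = round(mean(v3) * 100, 1))
```

Because the formatting lives in one place, changing the number of digits or the separator only requires editing the helper, not every summarise() call.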

【Discussion】:

    【Answer 2】:

    This is the most minimal version I could come up with.

    cat_var <- "v3"  # variables reported as a percentage rather than mean (SD)
    
    # Format one variable: percent for variables in cat_var, "mean (SD)" otherwise
    df_cal <- function(x, var) {
      if (var[1] %in% cat_var) return(as.character(round(mean(x) * 100, 1)))
      paste0(round(mean(x), 2), " (", round(sd(x), 2), ")")
    }
    
    # Long format: one row per observation and variable
    df_tall <- df %>% gather(var, x, v1:v3) %>% group_by(var)
    
    # group = -1 marks the overall summary so that it sorts before groups 0 and 1
    all <- df_tall %>% summarise(stat = df_cal(x, var)) %>% mutate(group = -1)
    groups <- df_tall %>% group_by(group, var) %>% summarise(stat = df_cal(x, var))
    
    bind_rows(all, groups) %>%
      ungroup() %>%
      mutate(var = factor(var, labels = c(
        "v1 mean (SD)", "v2 mean (SD)", "Percent v3"
      ))) %>%
      spread(group, stat) %>%
      kable("html", col.names = c(" ", "All", "Group 0", "Group 1")) %>%
      kable_styling()
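
As a variation on the approach above (not the answerer's code), the two summarise() calls can be collapsed into one by stacking a copy of the data labelled "All" on top of the per-group rows. The sketch below also uses pivot_longer() and pivot_wider(), which superseded gather() and spread() in tidyr 1.0.0. The helper name fmt_stat and the "All" / "Group ..." labels are assumptions chosen for illustration, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(group = c(1, 1, 1, 1, 0, 0, 0, 0),
                 v1 = c(1, 2, 3, 4, 5, 6, 1, 2),
                 v2 = c(4, 3, 2, 5, 3, 5, 3, 8),
                 v3 = c(0, 1, 0, 1, 1, 0, 1, 1))

# Hypothetical helper: percent for v3, "mean (SD)" for everything else
fmt_stat <- function(x, var) {
  if (var[1] == "v3") return(as.character(round(mean(x) * 100, 1)))
  paste0(round(mean(x), 2), " (", round(sd(x), 2), ")")
}

# Every row appears twice: once under "All" and once under its own group,
# so a single grouped summarise() produces both levels.
stacked <- bind_rows(
  df %>% mutate(group = "All"),
  df %>% mutate(group = paste("Group", group))
)

tab <- stacked %>%
  pivot_longer(v1:v3, names_to = "var", values_to = "x") %>%
  group_by(group, var) %>%
  summarise(stat = fmt_stat(x, var), .groups = "drop") %>%
  pivot_wider(names_from = group, values_from = stat)
```

The resulting `tab` has one row per variable and the columns var, All, Group 0 and Group 1, so it can be piped into kable() and kable_styling() in the same way as above. The trade-off is that the data is duplicated in memory, which is negligible for small tables like this one.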
    

    【Discussion】:
