【Question title】: How to create my statistical summary table manually
【Posted】: 2023-01-14 00:55:56
【Question description】:
mydata<-structure(list(Weight = c(66.2, 65.2, 69.8, 63.4, 67.4, 66.3, 
                         63.8, 67.8, 66.7, 66.2, 61.9, 66.9, 69.4, 60.8, 64.1, 62.8, 62.5, 
                         60.9, 61.3, 67.8), Age = c(68, 67, 65, 65, 63, 64, 68, 65, 65, 
                                                    71, 64, 65, 68, 61, 65, 62, 60, 66, 62, 58), 
               Sex = c("H", "H", 
                        "H", "H", "H", "H", "F", "F", "F", "F", "H", "H", "H", "F", "F", 
                        "F", "F", "F", "F", "F"),
               Group = c("G1", "G1", "G1", "G1", 
                          "G1", "G1", "G1", "G1", "G1", "G1", "G2", "G2", "G2", "G2", "G2",
                          "G2", "G2", "G2", "G2", "G2")), row.names = c(NA, -20L), 
          class = "data.frame")

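I want to summarize my data by building a table manually. My goal is to compare variables between two groups. I don't know of any software that returns the confidence interval of the difference in means and the p-value in table form. I have to use Rmarkdown to export my results to Word, so I need them as a table.

I created all the parameters like this: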

confInt<-paste(round(t.test(mydata$Weight~mydata$Group)$conf.int[1],2),
               round(t.test(mydata$Weight~mydata$Group)$conf.int[2],2),sep = ";")
p.value<-round(t.test(mydata$Weight~mydata$Group)$p.value,3)

mean1<-mean(mydata$Weight[mydata$Group=="G1"])
mean2<-mean(mydata$Weight[mydata$Group=="G2"])

mean_diff<-(mean(mydata$Weight[mydata$Group=="G1"])-
mean(mydata$Weight[mydata$Group=="G2"]))

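The goal is to create these parameters for each of my numeric variables with a loop or a function. First for the variable Weight:

Then bind the statistics for each variable together with rbind.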

【Question comments】:

    Tags: r automation


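    【Solution 1】:

    We can create a function that takes the data frame mydata, the name of a numeric column col, and the name of the grouping column group: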

    summary_val <- function(mydata,col,group){
      x <- mydata[[col]]
      group_data <- mydata[[group]]
      
      # Run the two-sample (Welch) t-test once and reuse the result
      tt <- t.test(x~group_data)
      confInt<-paste(round(tt$conf.int[1],2),
                     round(tt$conf.int[2],2),sep = ";")
      p.value<-round(tt$p.value,3)
      
      # Group means (the labels "G1" and "G2" are hard-coded here)
      mean1<-mean(x[group_data=="G1"])
      mean2<-mean(x[group_data=="G2"])
      
      # Difference in means, followed by its confidence interval in brackets
      mean_diff<-mean1-mean2
      diff <- paste0(mean_diff,"[",confInt,"]")
      return(data.frame(var=col,G1=mean1,G2=mean2,`Diff.CI.`=diff,`p.value`=p.value))
    }
    
    summary_val(mydata,"Weight","Group")
    
         var    G1    G2         Diff.CI. p.value
    1 Weight 66.28 63.84 2.44[-0.01;4.89]   0.051
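
    The function above hard-codes the labels "G1" and "G2" and pastes the mean difference unrounded, which can show floating-point noise for some columns. A minimal sketch of a variant that reads the two group labels from the data and rounds the difference is given below. The name summary_val_general and the digits argument are hypothetical, not part of the original answer, and the sketch assumes the grouping column has exactly two levels.

```r
# Hypothetical variant of summary_val(); assumes `group` has exactly two levels.
summary_val_general <- function(mydata, col, group, digits = 2) {
  x <- mydata[[col]]
  g <- factor(mydata[[group]])
  stopifnot(is.numeric(x), nlevels(g) == 2)

  tt <- t.test(x ~ g)            # Welch two-sample t-test, run once
  means <- tapply(x, g, mean)    # one mean per group, in level order
  ci <- round(tt$conf.int, digits)

  out <- data.frame(
    var = col,
    m1 = round(means[[1]], digits),
    m2 = round(means[[2]], digits),
    Diff.CI. = paste0(round(means[[1]] - means[[2]], digits),
                      "[", ci[1], ";", ci[2], "]"),
    p.value = round(tt$p.value, 3)
  )
  names(out)[2:3] <- levels(g)   # label the mean columns with the group names
  out
}

# Small demo using the Age and Group columns from the question's data
demo <- data.frame(
  Age = c(68, 67, 65, 65, 63, 64, 68, 65, 65, 71,
          64, 65, 68, 61, 65, 62, 60, 66, 62, 58),
  Group = rep(c("G1", "G2"), each = 10)
)
res <- summary_val_general(demo, "Age", "Group")
print(res)
```

    Because the difference is rounded before it is pasted, values such as 2.99999999999999 should no longer appear in the Diff.CI. column.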
    

    Then we can use the following to extract the names of the numeric columns:

    num_var <- names(mydata)[unlist(lapply(mydata, is.numeric))]
    num_var
    [1] "Weight" "Age"
    

    And build the summary table with a for loop:

    mysummary <- data.frame()
    for(var in num_var){
      mysummary <- rbind(mysummary,summary_val(mydata,var,"Group"))
    }
    mysummary
         var    G1    G2                    Diff.CI. p.value
    1 Weight 66.28 63.84            2.44[-0.01;4.89]   0.051
    2    Age 66.10 63.10 2.99999999999999[0.43;5.57]   0.025
    

    Or use do.call + lapply:

    summary_val2 <- function(col,mydata,group){
      summary_val(mydata,col,group)
    }
    
    do.call(rbind,lapply(num_var,summary_val2,mydata,"Group"))
         var    G1    G2                    Diff.CI. p.value
    1 Weight 66.28 63.84            2.44[-0.01;4.89]   0.051
    2    Age 66.10 63.10 2.99999999999999[0.43;5.57]   0.025
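
    Since the question asks for a Word export from Rmarkdown, one possible way to render the combined table is knitr::kable() inside a code chunk of a document whose YAML header sets output: word_document. This is only a sketch: it assumes the knitr package is installed, and the column labels and caption are illustrative choices, not something from the original answer.

```r
library(knitr)

# Values copied from the summary output above; in a real document this
# would be the `mysummary` data frame built by the loop.
mysummary <- data.frame(
  var = c("Weight", "Age"),
  G1 = c(66.28, 66.10),
  G2 = c(63.84, 63.10),
  Diff.CI. = c("2.44[-0.01;4.89]", "2.99999999999999[0.43;5.57]"),
  p.value = c(0.051, 0.025)
)

# kable() turns the data frame into a table that Rmarkdown can render
# in the output document; the labels below are illustrative only.
tbl <- kable(
  mysummary,
  col.names = c("Variable", "Mean G1", "Mean G2",
                "Difference [95% CI]", "p-value"),
  caption = "Comparison of G1 and G2"
)
tbl
```

    The 95% label matches the default conf.level of t.test(); it would need to change if a different confidence level were passed.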
    

    【Comments】:

    • Thank you @peter861222