How to format a describeBy table in R? Answers

【Question title】: How to format describeBy table in R?
【Posted】: 2016-10-28 00:05:15
【Question description】:

I have this dataset:

Defects.I    Defects.D       Treatment
    1           2               A
    1           3               B

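I am trying to compute descriptive statistics for the detected and isolated defects, grouped by treatment. After searching for a while, I found a nice function in the psych library called describeBy(). With the following code: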

describeBy(myData[1:2],myData$Treatment)

I get this output:

Treatment A        
                  Mean.   Median.    Trimmed.
    Defects.I       x        x          x
    Defects.D       x        x          x

Treatment B        
                  Mean.   Median.    Trimmed.
    Defects.I       x        x          x
    Defects.D       x        x          x

But what I am actually looking for is something like this:

                  Mean.   Median.    Trimmed.
                  A  B     A  B       A  B
    Defects.I     x  x     x  x       x  x 
    Defects.D     x  x     x  x       x  x

Data

myData <- structure(list(Defects.I = c(1L, 1L), Defects.D = 2:3, Treatment = c("A", 
"B")), .Names = c("Defects.I", "Defects.D", "Treatment"), class = "data.frame", row.names = c(NA, 
-2L))

【Question comments】:

l <- psych::describeBy(myData[1:2], myData$Treatment); do.call('cbind', l)[, order(sequence(lengths(l)))]
@rawr This is exactly what I was looking for! Could you post it as an answer? It would be great if you added one or two comments :)

Tags: r formatting psych

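【Solution 1】:

Since describeBy returns a list of data frames, we could simply cbind them all together, but that does not put the columns in the right order. Instead, we can interleave the columns.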

myData <- structure(list(Defects.I = c(1L, 1L), Defects.D = 2:3,
                         Treatment = c("A", "B")),
                    .Names = c("Defects.I", "Defects.D", "Treatment"),
                    class = "data.frame", row.names = c(NA, -2L))

l <- psych::describeBy(myData[1:2], myData$Treatment)

So interleave using this order:

order(sequence(c(ncol(l$A), ncol(l$B))))
# [1]  1 14  2 15  3 16  4 17  5 18  6 19  7 20  8 21  9 22 10 23 11 24 12 25 13 26

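rather than the within-frame column positions that a plain cbind leaves in two blocks, one data frame after the other: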

c(1:13, 1:13)
# [1]  1  2  3  4  5  6  7  8  9 10 11 12 13  1  2  3  4  5  6  7  8  9 10 11 12 13
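To see why order(sequence(...)) interleaves, it may help to try it on something smaller. The sketch below uses two hand-made data frames (`l_demo` is a hypothetical stand-in, not real describeBy output) with three columns each; it is only meant to illustrate the indexing trick, using base R alone.

```r
# Two small stand-in data frames (hypothetical; not real describeBy output)
A <- data.frame(n = 1, mean = 1, median = 1)
B <- data.frame(n = 1, mean = 3, median = 3)
l_demo <- list(A = A, B = B)

# Position of every column within its own data frame: 1 2 3 1 2 3
pos <- sequence(sapply(l_demo, ncol))

# order() sorts those positions and leaves ties in their original order,
# so the resulting index alternates between A's and B's columns
idx <- order(pos)

# cbind prefixes the column names with the list names (A., B.)
wide <- do.call(cbind, l_demo)[, idx]
names(wide)
```

Because ties keep their original order, the column from the first list element always comes before the matching column from the second.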

So this:

do.call('cbind', l)[, order(sequence(lengths(l)))]
#           A.vars B.vars A.n B.n A.mean B.mean A.sd B.sd A.median B.median A.trimmed B.trimmed A.mad B.mad
# Defects.I      1      1   1   1      1      1   NA   NA        1        1         1         1     0     0
# Defects.D      2      2   1   1      2      3   NA   NA        2        3         2         3     0     0
#           A.min B.min A.max B.max A.range B.range A.skew B.skew A.kurtosis B.kurtosis A.se B.se
# Defects.I     1     1     1     1       0       0     NA     NA         NA         NA   NA   NA
# Defects.D     2     3     2     3       0       0     NA     NA         NA         NA   NA   NA

Or as a function:

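# Bind a list of data frames, then interleave their columns (or rows) so that
# matching columns (rows) from each list element end up next to each other.
interleave <- function(l, how = c('cbind', 'rbind')) {
  how <- match.arg(how)
  bound <- do.call(how, l)
  if (how == 'rbind') {
    # order rows by their position within each original data frame
    bound[order(sequence(sapply(l, nrow))), ]
  } else {
    # order columns by their position within each original data frame
    bound[, order(sequence(sapply(l, ncol)))]
  }
}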

interleave(l)
#           A.vars B.vars A.n B.n
# Defects.I      1      1   1   1
# Defects.D      2      2   1   1 ...
# ...

interleave(l, 'r')
#             vars n mean sd median trimmed mad min max range skew kurtosis se
# A.Defects.I    1 1    1 NA      1       1   0   1   1     0   NA       NA NA
# B.Defects.I    1 1    1 NA      1       1   0   1   1     0   NA       NA NA
# A.Defects.D    2 1    2 NA      2       2   0   2   2     0   NA       NA NA
# B.Defects.D    2 1    3 NA      3       3   0   3   3     0   NA       NA NA

【Comments】:

Thanks for your answer! Just one small question: can you choose which statistics are displayed?
@p3rand0r It looks like some options are passed on from describeBy to describe, so you can do describeBy(..., skew = FALSE) to drop skew/kurtosis, but I would probably just subset the result afterwards
Great! Those are actually the ones I was trying to remove, thanks!
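The "subset afterwards" idea from the comment above could look roughly like this. It is a sketch under assumptions: `wide` is a hand-built stand-in holding a few columns of the interleaved output shown in Solution 1, `pick_stats` is a hypothetical helper name, and the group labels are assumed to contain no dot.

```r
# Stand-in for part of the interleaved result; values copied from the output above
wide <- data.frame(
  A.n = c(1, 1), B.n = c(1, 1),
  A.mean = c(1, 2), B.mean = c(1, 3),
  A.median = c(1, 2), B.median = c(1, 3),
  A.skew = c(NA, NA), B.skew = c(NA, NA),
  row.names = c("Defects.I", "Defects.D")
)

# Hypothetical helper: keep only the columns whose statistic name
# (the part after the "<group>." prefix) is listed in `stats`
pick_stats <- function(x, stats) {
  stat_of <- sub("^[^.]*\\.", "", names(x))
  x[, stat_of %in% stats, drop = FALSE]
}

small <- pick_stats(wide, c("mean", "median"))
names(small)
```

Since the columns are already interleaved, the kept columns stay in the A, B order for each statistic.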

【Solution 2】:

You can try the mat = TRUE argument. It is not exactly what you are looking for, but it is closer:

library(psych)
mydata = data.frame(Defects.I = c(1,1), Defects.D = c(2,3), Treatment = c('A','B'))

describeBy(mydata[1:2], mydata$Treatment, mat = TRUE)

which gives

           item group1 vars n mean sd median trimmed mad min max range skew kurtosis se
Defects.I1    1      A    1 1    1 NA      1       1   0   1   1     0   NA       NA NA
Defects.I2    2      B    1 1    1 NA      1       1   0   1   1     0   NA       NA NA
Defects.D1    3      A    2 1    2 NA      2       2   0   2   2     0   NA       NA NA
Defects.D2    4      B    2 1    3 NA      3       3   0   3   3     0   NA       NA NA

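【Comments】:

Thanks for your help, but as you can see the format is still different, since I would like to have the treatments on top if possible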
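One possible way to get from the long mat = TRUE layout to a treatments-across layout is base R's reshape(). The sketch below is an illustration only: `long` is a hand-built stand-in that copies just the vars, group1, mean and median columns from the output above, rather than a real describeBy result.

```r
# Stand-in for a few columns of the mat = TRUE output shown above
long <- data.frame(
  vars   = c(1, 1, 2, 2),
  group1 = c("A", "B", "A", "B"),
  mean   = c(1, 1, 2, 3),
  median = c(1, 1, 2, 3)
)

# One row per variable, one column per statistic/treatment pair
wide <- reshape(long, idvar = "vars", timevar = "group1", direction = "wide")

# Reorder so that A and B sit side by side for each statistic
stat_cols <- setdiff(names(wide), "vars")
stat_name <- sub("\\.[^.]*$", "", stat_cols)   # drop the ".A" / ".B" suffix
wide <- wide[, c("vars", stat_cols[order(stat_name)])]
names(wide)
```

Note that reshape() names the new columns statistic first (mean.A, mean.B), the reverse of the A.mean style produced by cbind in Solution 1.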