Psych "describe" and "describeBy" class to data.frame error

【Question title】: Psych "describe" and "describeBy" class to data.frame error
【Posted】: 2022-01-12 23:54:30
【Question description】:

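I have the following problem with the describe function from psych: I want to describe selected variables of a data frame and then drop some of the results with subset and select. That only seems to work on a data frame, but what I get is an object of class describe. To me it seems to work sometimes and not at other times, which I did not think was possible. It did work a few times, though: I could save the output, nicely arranged and exactly the way I wanted it. Now it again returns the error that class describe cannot be coerced to a data frame. The problem I see may be that I get a list of lists (at least that is what the environment pane says). Since I am a complete beginner at programming, I cannot solve this; even after searching for how to convert this class, I just do not understand it.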

Descriptives = describe(NumericData[5:44], na.rm = TRUE, interp = FALSE, 
                        skew = TRUE, ranges = TRUE, trim = .1, type = 3, 
                        check = TRUE, fast = NULL, quant = c(.25, .50, .75), 
                        IQR = FALSE)
Descriptives = as.data.frame(Descriptives) 
Descriptives = subset(Descriptives, select = -c(vars, median, trimmed, mad, range))
colnames(Descriptives) = c("N", "MEAN", "SD", "MIN", "MAX", "SKEW", "KURTOSIS", "SE", "Q1", "MEDIAN", "Q3")
Descriptives = round(Descriptives, digits = 4)
options(max.print = 1000)
print(as.data.frame(Descriptives))
write.table(Descriptives, file = "Descriptives.txt", sep = ",")

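【Question discussion】:

Without a reproducible data sample we cannot run your code or see any output. Where exactly does the error occur? Run the code line by line to start debugging. Also, if the problem is the conversion to a data frame, the last 4 lines are probably unrelated to it; you could remove them from the question and focus on identifying the problem.
Looking again, psych::describe and psych::describeBy both take a data frame or matrix and return a data frame. Have you tried converting to a data frame to remove the describe class? If I swap your data for one of the example data sets from the package documentation, I do not get any error.

Tags: r class psych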

【Solution 1】:

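OK, so the error now occurs on every line (first block = line 1). The describe call itself runs fine, but I cannot select a subset of the result variables (line 3), where R says that argument "subset" is missing, nor name the columns (line 4), where R says I am attempting to set 'colnames' on an object with less than two dimensions. In addition, round (line 5) returns the error "non-numeric argument to mathematical function". The error that used to occur in line 2, cannot coerce class '"describe"' to a data.frame, now appears when I try to print the result. It worked once, most recently at least without selecting a subset of the result variables, but now nothing works and I do not understand why. The code has not changed.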
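All four messages are consistent with Descriptives being a plain list rather than a data frame, so checking the class before converting is a cheap first diagnostic. The following is a minimal base-R sketch under that assumption; `obj` is a hypothetical stand-in built by hand, not the poster's actual object.

```r
# Hypothetical stand-in for the failing object: a bare list that carries only
# the class "describe", similar in shape to the dput() output in this post.
obj <- structure(
  list(EmQ = list(descript = "EmQ", counts = c(n = "97", missing = "29"))),
  class = "describe"
)

# Step 1: inspect the class before converting anything.
is_df <- inherits(obj, "data.frame")

# Step 2: capture the conversion error instead of letting it stop the script.
msg <- tryCatch(
  {
    as.data.frame(obj)
    "converted without error"
  },
  error = function(e) conditionMessage(e)
)

print(class(obj))
print(is_df)
print(msg)
```

If `is_df` is FALSE, the later calls to subset, colnames and round are working on a list, which would explain why each of them fails in its own way.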

Two variables of the data set I am using:

structure(list(Age = c(24, 23, 44, 48, 35, 56, 64, 29, 20, 62, 
35, 31, 32, 60, 57, 66, 46, 18, 52, 63, 64, 35, 54, 58, 61, 52, 
52, 33, 49, 28, 22, 27, 40, 53, 18, 19, 43, 44, 26, 28, 38, 18, 
50, 45, 23, 38, 50, 36, 72, 62, 33, 28, 29, 42, 48, 42, 29, 70, 
27, 33, 22, 62, 67, 20, 32, 22, 32, 67, 29, 55, 49, 19, 52, 20, 
30, 24, 18, 24, 23, 22, 19, 20, 29, 22, 20, 19, 21, 18, 22, 22, 
18, 24, 22, 24, 19, 25, 24, 25, 20, 21, 23, 39, 60, 53, 47, 48, 
40, 29, 24, 27, 21, 21, 27, 22, 20, 23, 36, 22, 25, 27, 66, 54, 
54, 64, 49, 40), FTND = c(5, 7, 0, 6, 0, 6, 0, NA, 3, 4, 0, 7, 
NA, 0, 4, 3, 4, 1, 0, 6, 0, 5, 0, NA, NA, 3, 0, 2, NA, 0, 0, 
0, NA, NA, NA, NA, NA, 4, 0, 10, NA, NA, 8, NA, 3, 7, 0, 0, 5, 
2, 0, 6, 7, 0, 4, 2, 0, NA, 0, 0, 0, 0, 0, 0, 4, 0, 0, NA, 3, 
NA, NA, NA, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, 
-126L), class = "data.frame")

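The object I get from describe is a list of lists that does not contain the result variables but instead all the variables of my original data set. Also, dput(Descriptives[5:6]) should print the variables Age and FTND, but it prints EmQ instead (which is actually variable/row 9):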

structure(list(EmQ = structure(list(descript = "EmQ", units = NULL, 
    format = NULL, counts = c(n = "97", missing = "29", distinct = "39", 
    Info = "0.998", Mean = "44.04", Gmd = "12.33", `.05` = "23.6", 
    `.10` = "27.6", `.25` = "38.0", `.50` = "45.0", `.75` = "50.0", 
    `.90` = "58.4", `.95` = "60.2"), values = list(value = c(19, 
    21, 22, 24, 25, 27, 28, 30, 32, 33, 34, 36, 37, 38, 39, 40, 
    41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 
    56, 57, 58, 59, 60, 61, 64, 66), frequency = structure(c(2, 
    2, 1, 2, 2, 1, 1, 2, 2, 2, 3, 1, 3, 1, 2, 2, 3, 4, 3, 4, 
    8, 7, 3, 6, 2, 4, 3, 1, 1, 1, 1, 4, 2, 1, 2, 3, 2, 2, 1), .Dim = 39L)), 
    extremes = c(L1 = 19, L2 = 21, L3 = 22, L4 = 24, L5 = 25, 
    H5 = 59, H4 = 60, H3 = 61, H2 = 64, H1 = 66)), class = "describe"), 
    EmQ10 = structure(list(descript = "EmQ10", units = NULL, 
        format = NULL, counts = c(n = "108", missing = "18", 
        distinct = "19", Info = "0.993", Mean = "10.16", Gmd = "4.346", 
        `.05` = "4.0", `.10` = "5.7", `.25` = "7.0", `.50` = "10.0", 
        `.75` = "13.0", `.90` = "15.0", `.95` = "16.0"), values = list(
            value = c(2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 
            14, 15, 16, 17, 18, 19, 20), frequency = structure(c(1, 
            3, 4, 3, 5, 13, 10, 11, 10, 11, 5, 11, 5, 6, 5, 2, 
            1, 1, 1), .Dim = 19L)), extremes = c(L1 = 2, L2 = 3, 
        L3 = 4, L4 = 5, L5 = 6, H5 = 16, H4 = 17, H3 = 18, H2 = 19, 
        H1 = 20)), class = "describe")), descript = "NumericData[5:44]", dimensions = c(126L, 
2L), class = "describe")
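The fields in this dump (descript, counts with Info and Gmd, values, extremes) look like the output of Hmisc::describe rather than psych::describe. If both packages are attached, the one attached last masks the other, which would also fit the "sometimes works, sometimes not" behaviour, since the result then depends on the order of the library() calls in a session. The sketch below assumes that diagnosis and that psych is installed; it uses a shortened copy of the Age and FTND values from the post and calls the function with an explicit psych:: prefix.

```r
# Shortened copy of two of the poster's columns (first ten values each).
NumericData <- data.frame(
  Age  = c(24, 23, 44, 48, 35, 56, 64, 29, 20, 62),
  FTND = c(5, 7, 0, 6, 0, 6, 0, NA, 3, 4)
)

# List every attached package that provides an object named "describe".
# If more than one shows up, the first entry is the one a bare describe() uses.
describe_sources <- find("describe")
print(describe_sources)

if (requireNamespace("psych", quietly = TRUE)) {
  # The psych:: prefix picks psych's version regardless of attach order.
  Descriptives <- psych::describe(NumericData, na.rm = TRUE, skew = TRUE,
                                  ranges = TRUE, trim = .1, type = 3,
                                  quant = c(.25, .50, .75), IQR = FALSE)
  # Drop the extra classes so that only "data.frame" remains.
  Descriptives <- as.data.frame(Descriptives)
  Descriptives <- subset(Descriptives,
                         select = -c(vars, median, trimmed, mad, range))
  colnames(Descriptives) <- c("N", "MEAN", "SD", "MIN", "MAX", "SKEW",
                              "KURTOSIS", "SE", "Q1", "MEDIAN", "Q3")
  Descriptives <- round(Descriptives, digits = 4)
  print(Descriptives)
}
```

Writing psych::describe everywhere is more verbose than relying on the search path, but it keeps the script independent of which other packages happen to be loaded.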

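This is what I get from describeBy. It is also a list, with only 2 groups in list 1 (?), namely controls and patients, and it also contains the desired result variables such as trimmed and median (as attributes, I guess):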

structure(list(NULL, NULL), .Dim = 2L, .Dimnames = list(Group = c(NA_character_, 
NA_character_)))
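By default psych::describeBy returns one table per group, wrapped in a list-like object, which is awkward to pass to subset or write.table. A possible way around that, sketched here under the assumption that psych is installed, is the mat = TRUE argument, which asks for a single table with one row per variable and group. The Group column and its labels are made up for illustration.

```r
# Hypothetical grouped data: labels "control" and "patient" are assumptions,
# the Age and FTND values are only placeholders.
dat <- data.frame(
  Group = rep(c("control", "patient"), each = 5),
  Age   = c(24, 23, 44, 48, 35, 56, 64, 29, 20, 62),
  FTND  = c(5, 7, 0, 6, 0, 6, 0, 1, 3, 4)
)

if (requireNamespace("psych", quietly = TRUE)) {
  # mat = TRUE requests one combined table instead of a list of tables.
  by_tab <- psych::describeBy(dat[c("Age", "FTND")], group = dat$Group,
                              mat = TRUE)
  by_tab <- as.data.frame(by_tab)
  print(by_tab)
}
```

The combined table can then be trimmed and renamed in the same way as the ungrouped result.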

Sorry for the long post; I did not know how to phrase it any better.

【Discussion】:

【Solution 2】:

This is what I once got and would like to get again:


       N    MEAN      SD  MIN  MAX    SKEW  KURTOSIS      SE     Q1  MEDIAN   Q3
Age  126  36.254 15.6578   18   72  0.6067   -0.9925  1.3949  22.25    30.5   49
FTND 107  1.2617  2.3121    0   10  1.7475    2.0378  0.2235      0       0  1.5

【Discussion】:

This should be added to your original question rather than posted as an answer.