【问题标题】:Return variance from Bartlett's test of homogeneity of variance in R来自 Bartlett 的 R 中方差同质性检验的返回方差
【发布时间】:2016-06-30 21:51:56
【问题描述】:

Bartlett 检验允许您测试不同组中的方差是否相同。

R 中的stats 包具有函数bartlett.test。这是一个使用 R 中可用数据集的示例。

bt <- bartlett.test(count ~ spray, data = InsectSprays)

您如何从bartlett.test 获得实际差异?

我似乎在 bt 对象中找不到这个

names(bt)
[1] "data.name" "method"    "p.value"   "parameter" "statistic"

您可以使用var() 自己计算方差。一种方法是使用summaryBy

library(doBy)
summaryBy(count~spray, data=InsectSprays, FUN=var)

但是,您希望bartlett.test 提供每组的方差。同样,在 R 中计算 t.test 也会为您提供每组的平均值。那么,我们能否从 R 中的bartlett.test 中提取每组的方差,以及如何提取?

【问题讨论】:

  • bartlett.test 的手册不是说要使用 var.test,还是我误解了它?

标签: r variance


【解决方案1】:

不,很遗憾你不能。

方差不会出现在返回对象的结构中。

我阅读了函数的源代码,您可以通过重新编写它来提取它,但这将比执行您已有的解决方案要多得多的自定义工作。

你可以在这里看到我的意思:

#  File src/library/stats/R/bartlett.test.R
#  Part of the R package, https://www.R-project.org
#
#  Copyright (C) 1995-2015 The R Core Team
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  A copy of the GNU General Public License is available at
#  https://www.R-project.org/Licenses/

bartlett.test <- function(x, ...) UseMethod("bartlett.test")

bartlett.test.default <-
function(x, g, ...)
{
    LM <- FALSE
    if (is.list(x)) {
        if (length(x) < 2L)
            stop("'x' must be a list with at least 2 elements")
        DNAME <- deparse(substitute(x))
        if (all(sapply(x, function(obj) inherits(obj, "lm"))))
            LM <- TRUE
        else
            x <- lapply(x, function(x) x <- x[is.finite(x)])
        k <- length(x)
    }
    else {
        if (length(x) != length(g))
            stop("'x' and 'g' must have the same length")
        DNAME <- paste(deparse(substitute(x)), "and",
                       deparse(substitute(g)))
        OK <- complete.cases(x, g)
        x <- x[OK]
        g <- factor(g[OK])
        k <- nlevels(g)
        if (k < 2)
            stop("all observations are in the same group")
        x <- split(x, g)
    }

    if (LM) {
        n <- sapply(x, function(obj) obj$df.resid)
        v <- sapply(x, function(obj) sum(obj$residuals^2))
    } else {
        n <- sapply(x, "length") - 1
        if (any(n <= 0))
            stop("there must be at least 2 observations in each group")
        v <- sapply(x, "var")
    }

    n.total <- sum(n)
    v.total <- sum(n * v) / n.total
    STATISTIC <- ((n.total * log(v.total) - sum(n * log(v))) /
                  (1 + (sum(1 / n) - 1 / n.total) / (3 * (k - 1))))
    PARAMETER <- k - 1
    PVAL <- pchisq(STATISTIC, PARAMETER, lower.tail = FALSE)
    names(STATISTIC) <- "Bartlett's K-squared"
    names(PARAMETER) <- "df"

    RVAL <- list(statistic = STATISTIC,
                 parameter = PARAMETER,
                 p.value = PVAL,
                 data.name = DNAME,
                 method = "Bartlett test of homogeneity of variances")
    class(RVAL) <- "htest"
    return(RVAL)
}

bartlett.test.formula <-
function(formula, data, subset, na.action, ...)
{
    if(missing(formula) || (length(formula) != 3L))
        stop("'formula' missing or incorrect")
    m <- match.call(expand.dots = FALSE)
    if(is.matrix(eval(m$data, parent.frame())))
        m$data <- as.data.frame(data)
    ## need stats:: for non-standard evaluation
    m[[1L]] <- quote(stats::model.frame)
    mf <- eval(m, parent.frame())
    if(length(mf) != 2L)
        stop("'formula' should be of the form response ~ group")
    DNAME <- paste(names(mf), collapse = " by ")
    names(mf) <- NULL
    y <- do.call("bartlett.test", as.list(mf))
    y$data.name <- DNAME
    y
}

【讨论】:

    【解决方案2】:

    稍微比 Hack-R 建议的更容易破解,但他们是对的,像 sapply(split(InsectSprays,spray),function(x) var(x$count))(在基础 R 中完成所有操作)可能更容易。

    这里展示的技术很脆弱,因为它依赖于内置函数的确切形式;如果在 R.Safer 的未来版本中对该功能进行了微小的更改,它将停止工作,将整个功能改为 dump() 并根据自己的喜好进行修改,然后 source() 结果。

    bb <- stats:::bartlett.test.default
    bb2 <- body(bb)
    ## add a line to save the variances
    bb2[[12]] <- quote(ESTIMATE <- v)
    ## add the variances to the return list
    bb2[[13]] <- quote(RVAL <- list(statistic = STATISTIC, parameter = PARAMETER, estimate = ESTIMATE, p.value = PVAL, data.name = DNAME, method = "Bartlett test of homogeneity of variances"))
    ## restore the rest of the function
    bb2[14:15] <- body(bb)[13:14]
    body(bb) <- bb2
    

    现在把它放回stats命名空间:

    assignInNamespace("bartlett.test.default",bb,pos="package:stats")
    

    测试:

    (bt <- bartlett.test(count~spray,data=InsectSprays))
    ##  Bartlett test of homogeneity of variances
    ## 
    ## data:  count by spray
    ## Bartlett's K-squared = 25.96, df = 5, p-value = 9.085e-05
    ## sample estimates:
    ##         A         B         C         D         E         F 
    ## 22.272727 18.242424  3.901515  6.265152  3.000000 38.606061 
    

    您可以通过bt$estimate检索值。

    可能值得在 r-devel 上建议此作为对 bartlett.test 的增强:我能想到的唯一反驳论点是,如果有很多组正在测试,输出会很笨拙。

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 2011-02-25
      • 2020-06-04
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 2015-04-26
      相关资源
      最近更新 更多