panelmodel (plm) 对象上的 texreg；附加信息答案

【问题标题】：texreg on panelmodel (plm) object; additional gof informationpanelmodel (plm) 对象上的 texreg；附加信息
【发布时间】：2018-05-24 11:00:13
【问题描述】：

如何在texreg 调用中自定义拟合优度 (gof)？

我想展示一些plm 模型，但我只得到"Num. obs.", "Adj. R^2", 和, "R^2"（参见下面的工作示例）。但是，我希望全部显示小的 n、T、F-statistic 和 p-value，我在默认的 summary() 调用中获得的所有内容。

我得到的一个例子。首先是一些数据和所需的包，

# install.packages(c("wooldridge", "plm", "texreg"), dependencies = TRUE)
library(wooldridge) 
data(wagepan)
library(plm)

二、部分机型，

POLS <- plm(lwage ~ educ + black + hisp + exper+I(exper^2)+ married + union +
            factor(year), data = wagepan, index=c("nr","year") , model="pooling")

RE <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married + union +
          factor(year), data = wagepan, index = c("nr","year") , model = "random") 

FE <- plm(lwage ~ I(exper^2) + married + union + factor(year), 
          data = wagepan, index = c("nr","year"), model="within")

第三，我当前的texreg 调用及其输出，

# library(texreg)                      
texreg::screenreg(list(POLS, RE, FE), custom.coef.map = list('married' = 'Marrtied', 'union' = 'Union'))
#> ================================================
#>            Model 1      Model 2      Model 3    
#> ------------------------------------------------
#> Marrtied      0.11 ***     0.06 ***     0.05 *  
#>              (0.02)       (0.02)       (0.02)   
#> Union         0.18 ***     0.11 ***     0.08 ***
#>              (0.02)       (0.02)       (0.02)   
#> ------------------------------------------------
#> R^2           0.19         0.18         0.18    
#> Adj. R^2      0.19         0.18         0.06    
#> Num. obs.  4360         4360         4360       
#> ================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05

我确实尝试添加, include.fstatistic = TRUE，但似乎我无法做到这一点。因为我需要一些额外的定制。

我的目标是这样的，

#> ------------------------------------------------
#> Obs.  (N)   4360         4360         4360       
#> Indiv.(n)    545          545          545       
#> Time  (T)      8            8            8       
#> R^2           0.19         0.18         0.18    
#> Adj. R^2      0.19         0.18         0.06    
#> F-stat       72.458       68.4124      83.8515  
#> P-value        (2.22e-16)   (2.22e-16)   (2.22e-16) 
#> ================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05

【问题讨论】：

标签： texreg plm texreg r latex plm texreg

【解决方案1】：

跟进@jaySF 的回答，创建您自己的包含默认值的提取函数，并注册它：

custom_extract_plm <- function(model, ...) {
  s <- summary(model)
  ex.1 <- texreg:::extract.plm(model, ...)

  fv.1 <- s$fstatistic$statistic
  pv.1 <- s$fstatistic$p.value

  ex.1@gof.names <- c(ex.1@gof.names, "F-stat", "P-value")
  ex.1@gof <- c(ex.1@gof, fv.1, pv.1)
  ex.1@gof.decimal <- c(ex.1@gof.decimal, TRUE, TRUE)

  ex.1
}

setMethod(texreg:::extract, signature = className("plm", "plm"), custom_extract_plm)

现在你得到一个 F 统计量：

> texreg::screenreg(list(POLS, RE, FE), custom.coef.map = list('married' = 'Marrtied', 'union' = 'Union'))

================================================
           Model 1      Model 2      Model 3    
------------------------------------------------
Marrtied      0.11 ***     0.06 ***     0.05 *  
             (0.02)       (0.02)       (0.02)   
Union         0.18 ***     0.11 ***     0.08 ***
             (0.02)       (0.02)       (0.02)   
------------------------------------------------
R^2           0.19         0.18         0.18    
Adj. R^2      0.19         0.18         0.06    
Num. obs.  4360         4360         4360       
F-stat       72.46        68.41        83.85    
P-value       0.00         0.00         0.00    
================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

【讨论】：

感谢您对讨论的贡献。我想知道您是否有时间在您的答案中包含小n，上面的Indiv. (n) 是什么？

【解决方案2】：

您可以使用texreg::extract() 破解它。

为了获得“小n”，我们首先需要一个小函数。

getIndex <- function(fit){
  # extracts number of factor levels of index variables
  # from raw data used in models
  index.names <- as.character(as.list(summary(fit)$call)$index)[-1]
  if (length(index.names == 1)){
    df.name <- as.character(as.list(summary(fit)$call)$data)
    index.df <- get(df.name)[, index.names]
    length(unique(index.df))
  }
  if (length(index.names == 2)){
    df.name <- as.character(as.list(summary(fit)$call)$data)
    index.df <- get(df.name)[, index.names]
    cbind(length(unique(index.df[, 1])), 
          length(unique(index.df[, 2]))) 
  } else {
    stop("no index variables specified in model")
  }

}

然后继续提取。

fv.1 <- summary(POLS)$fstatistic$statistic  # get F statistic
pv.1 <- summary(POLS)$fstatistic$p.value  # get p value
ns.1 <- getIndex(POLS)[1]  # get small n
tm.1 <- getIndex(POLS)[2]  # get times

library(texreg)
ex.1 <- extract(POLS)  # extract coefficients and GOF measures
ex.1@gof.names <- c(ex.1@gof.names[1:3],"Indiv.(n)", "Time (T)", 
                    "F-stat", "P-value")  # the GOF names
ex.1@gof <- c(ex.1@gof[1:3], ns.1, tm.1, fv.1, pv.1)  # the GOF values
ex.1@gof.decimal <- c(ex.1@gof.decimal[1:3], FALSE, FALSE, TRUE, TRUE)  # numeric or integer

fv.2 <- summary(RE)$fstatistic$statistic
pv.2 <- summary(RE)$fstatistic$p.value
ns.2 <- getIndex(RE)[1]
tm.2 <- getIndex(RE)[2]

ex.2 <- extract(RE)
ex.2@gof.names <- c(ex.2@gof.names[1:3],"Indiv.(n)", "Time (T)", 
                    "F-stat", "P-value")
ex.2@gof <- c(ex.2@gof[1:3], ns.2, tm.2, fv.2, pv.2)
ex.2@gof.decimal <- c(ex.2@gof.decimal[1:3], FALSE, FALSE, TRUE, TRUE)

fv.3 <- summary(FE)$fstatistic$statistic
pv.3 <- summary(FE)$fstatistic$p.value
ns.3 <- getIndex(FE)[1]
tm.3 <- getIndex(FE)[2]

ex.3 <- extract(FE)
ex.3@gof.names <- c(ex.3@gof.names[1:3],"Indiv.(n)", "Time (T)", 
                    "F-stat", "P-value")
ex.3@gof <- c(ex.3@gof[1:3], ns.3, tm.3, fv.3, pv.3)
ex.3@gof.decimal <- c(ex.3@gof.decimal[1:3], FALSE, FALSE, TRUE, TRUE)

产量

> screenreg(list(ex.1, ex.2, ex.3))

=======================================================
                  Model 1      Model 2      Model 3    
-------------------------------------------------------
[TRUNCATED...]
-------------------------------------------------------
R^2                  0.19         0.18         0.18    
Adj. R^2             0.19         0.18         0.06    
Num. obs.         4360         4360         4360       
Indiv.(n)          545          545          545       
Time (T)             8            8            8       
F-stat              72.46        68.41        83.85    
P-value              0.00         0.00         0.00    
=======================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

看看例如str(extract(FE)) 将此应用于其他 GOF。

要将其包装到函数中，请查看@Neal Fultz 答案中的代码。

【讨论】：

哇，确实需要大量的 hacking 才能到达那里。我曾想象会有一些我忽略的简单调用或命令。感谢您展示这种方法！
欢迎您！希望我没有重新发明轮子，不过我也找不到简单的命令。
我会坚持几个小时，看看是否有人能指着轮子。再次感谢。
您愿意编辑您的答案以包含545 Indiv.(n) 吗？如果是这样，我很乐意奖励你。
当然，我很荣幸。这真是一个 hack，困难在于，n 和 time 的小变量似乎没有保存在 summary(model) 中，所以我们必须查阅原始数据来获取它们。我也用其他一些数据测试了getIndex()，并且在大多数情况下应该可以工作。使用例如测试功能plm(Sepal.Length ~ Sepal.Width, iris) 与 plm(Sepal.Length ~ Sepal.Width, iris, index="Species").

【解决方案3】：

这是使用我的huxtable 包中的huxreg 的一种可能性。您还需要安装 broom 软件包。

library(huxtable)

ht <- huxreg(POLS, RE, FE, 
      coefs = c("Married" = "married", "Union" = "union"), 
      statistics = c("Obs. (N)" = "nobs", "Adj. R^2" = "adj.r.squared", 
        "F statistic" = "statistic", "P value" = "p.value"))

时间单位的数量不在broom::glance中，所以我们手动添加——这里有一个简单的方法：

Ts <- purrr::map_int(list(POLS, RE, FE), list(pdim, "nT", "T"))
ns <- purrr::map_int(list(POLS, RE, FE), list(pdim, "nT", "n"))
ht <- insert_row(ht, c("Time (T)", Ts), after = 6)
ht <- insert_row(ht, c("Indiv. (n)", ns), after = 6)
ht

───────────────────────────────────────────────────────────────
                     (1)             (2)             (3)       
              ─────────────────────────────────────────────────
  Married          0.108 ***       0.064 ***       0.047 *     
                  (0.016)         (0.017)         (0.018)      
  Union            0.182 ***       0.106 ***       0.080 ***   
                  (0.017)         (0.018)         (0.019)      
              ─────────────────────────────────────────────────
  Obs. (N)      4360            4360            4360           
  Indiv. (n)     545             545             545           
  Time (T)         8               8               8           
  Adj. R^2         0.187           0.178           0.061       
  F statistic     72.459          68.412          83.851       
  P value          7.250e-186      5.813e-176      1.655e-156  
───────────────────────────────────────────────────────────────
  *** p < 0.001; ** p < 0.01; * p < 0.05. 

Column names: names, model1, model2, model3

这将在 Rmarkdown 文档中自动打印为 TeX/HTML。您可以对其进行编辑以添加行、列和进一步的格式。

【讨论】：

哇，感谢您展示了您的 huxtable 包如何回答我的问题。也许我应该使用它！我正在导出到 LaTeX。两个快速的问题。你是怎么找到statistic和p.value这两个名字的，问我还是小号n，我叫Indiv.(n)。其次，你写huxtable的动机是什么？我想你有链接吗？ :)
这些名字来自broom::glance。小号n 那里不可用...你可以在pdim(model)$nT$n 中找到它，我通过查看plm:::print.summary.plm 的代码了解到这一点！
更新了我的答案以添加个人
非常好。谢谢！然而，我有点不愿意奖励这个答案的弹性，因为我在最初的问题中要求texreg 解决方案。这并不是说你没有回答我的问题！