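Here is one way to do this with lapply(), using the mtcars data set. We choose mpg as the dependent variable, extract the remaining column names from the data set, and then use lapply() to fit a regression model for each element of the indepVars vector. The output for each model is saved to a list that holds the name of the independent variable together with the resulting model object.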
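# every column except mpg is treated as an independent variable
indepVars <- names(mtcars)[!(names(mtcars) %in% "mpg")]
# fit one simple regression per independent variable
modelList <- lapply(indepVars,function(x){
     result <- lm(mpg ~ mtcars[[x]],data=mtcars)
     # keep the variable name alongside the fitted model
     list(variable=x,model=result)
})
# print the first model
modelList[[1]]$variable
summary(modelList[[1]]$model)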
The extraction operator [[ can then be used to print the contents of any model.
...and the output:
> # print the first model
> modelList[[1]]$variable
[1] "cyl"
> summary(modelList[[1]]$model)
Call:
lm(formula = mpg ~ mtcars[[x]], data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-4.9814 -2.1185  0.2217  1.0717  7.5186

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  37.8846     2.0738   18.27  < 2e-16 ***
mtcars[[x]]  -2.8758     0.3224   -8.92 6.11e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.206 on 30 degrees of freedom
Multiple R-squared:  0.7262,  Adjusted R-squared:  0.7171
F-statistic: 79.56 on 1 and 30 DF,  p-value: 6.113e-10
>
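One side effect of writing the formula as mpg ~ mtcars[[x]] is that the coefficient is labelled mtcars[[x]] in every summary, so the output alone does not say which variable was used. A possible variation, sketched here as an alternative and not as part of the original answer, builds each formula from the variable name with reformulate() and names the list elements so a model can be pulled out by name with [[. The name namedModels is purely illustrative.

```r
# Sketch: build each formula from the variable name so the coefficient
# is labelled by that name instead of "mtcars[[x]]".
indepVars <- setdiff(names(mtcars), "mpg")

namedModels <- lapply(indepVars, function(x) {
  # reformulate("cyl", response = "mpg") yields the formula mpg ~ cyl
  lm(reformulate(x, response = "mpg"), data = mtcars)
})
names(namedModels) <- indepVars   # allows lookup by variable name

# extract one model by name with [[ and inspect its coefficients
coef(namedModels[["wt"]])
```

The trade-off is that the Call line of each summary then shows the reformulate() expression, while the coefficient table shows the actual variable name.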
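In response to the original poster's comment, here is the code needed to wrap the process above in an R function. The function regList() takes a data frame and a string naming the dependent variable, then regresses the dependent variable on each of the remaining variables in the data frame passed to the function.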
regList <- function(dataframe,depVar) {
     # every column except the dependent variable is an independent variable
     indepVars <- names(dataframe)[!(names(dataframe) %in% depVar)]
     modelList <- lapply(indepVars,function(x){
          message("x is: ",x)   # progress message for each variable
          result <- lm(dataframe[[depVar]] ~ dataframe[[x]],data=dataframe)
          list(variable=x,model=result)
     })
     modelList   # return the list of variable/model pairs
}
modelList <- regList(mtcars,"mpg")
# print the first model
modelList[[1]]$variable
summary(modelList[[1]]$model)
Various components can be extracted from the individual model objects. The output is as follows:
> modelList <- regList(mtcars,"mpg")
> # print the first model
> modelList[[1]]$variable
[1] "cyl"
> summary(modelList[[1]]$model)
Call:
lm(formula = dataframe[[depVar]] ~ dataframe[[x]], data = dataframe)

Residuals:
    Min      1Q  Median      3Q     Max
-4.9814 -2.1185  0.2217  1.0717  7.5186

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)     37.8846     2.0738   18.27  < 2e-16 ***
dataframe[[x]]  -2.8758     0.3224   -8.92 6.11e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.206 on 30 degrees of freedom
Multiple R-squared:  0.7262,  Adjusted R-squared:  0.7171
F-statistic: 79.56 on 1 and 30 DF,  p-value: 6.113e-10
>
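As an illustration of extracting content from the individual model objects, the sketch below collects the slope, R-squared, and slope p-value of every model into one data frame. It repeats a compact copy of regList() (without the progress message) so that it runs on its own. The name modelSummary and its column names are assumptions made for this example, not part of the original answer.

```r
# Compact copy of regList() so this sketch is self-contained
regList <- function(dataframe, depVar) {
  indepVars <- names(dataframe)[!(names(dataframe) %in% depVar)]
  lapply(indepVars, function(x) {
    result <- lm(dataframe[[depVar]] ~ dataframe[[x]], data = dataframe)
    list(variable = x, model = result)
  })
}

modelList <- regList(mtcars, "mpg")

# One row per model: variable name, slope, R-squared, and slope p-value
modelSummary <- do.call(rbind, lapply(modelList, function(m) {
  s <- summary(m$model)
  data.frame(
    variable = m$variable,
    slope    = coef(m$model)[[2]],              # second coefficient is the slope
    rSquared = s$r.squared,
    pValue   = s$coefficients[2, "Pr(>|t|)"]
  )
}))

# Show the models ordered from best to worst fit
modelSummary[order(-modelSummary$rSquared), ]
```

Selecting the slope by position works here because each model has exactly one predictor. With more terms per model, selecting coefficients by name would be the safer choice.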