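【Posted】: 2020-03-24 12:08:45
【Question】:
I am working with a regression model (MWT1Best = 428 - 7.69*Diabetes - 72.1*AtrialFib - 130*DAF) and want a quick way to see the model's value when I substitute 1 or 0 for the variables Diabetes and AtrialFib. When I use prediction(), I get the following error and cannot work out why: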
list("Diabetes" = prediction(r123, at = list(Diabetes = c(0, 1))),
     "AtrialFib" = prediction(r123, at = list(AtrialFib = c(0, 1))),
     "Diabetes*AtrialFib" = prediction(r123,
                                       at = list(Diabetes = c(0, 1), AtrialFib = c(0, 1))))
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
variable lengths differ (found for 'factor(DAF)')
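For reference, the substitution described above can also be done by hand from the rounded equation quoted in the question. This is only an arithmetic sketch: the coefficient names (b0, b_diab, b_af, b_daf) and the grid data frame are made up for illustration, and DAF is taken to be the product Diabetes * AtrialFib.

```r
# Rounded coefficients copied from the equation quoted in the question.
b0 <- 428; b_diab <- -7.69; b_af <- -72.1; b_daf <- -130

# All four 0/1 combinations of the two indicators (hypothetical helper grid).
grid <- expand.grid(Diabetes = c(0, 1), AtrialFib = c(0, 1))
grid$DAF <- grid$Diabetes * grid$AtrialFib

# Plug each combination into the fitted equation.
grid$fitted <- b0 + b_diab * grid$Diabetes + b_af * grid$AtrialFib + b_daf * grid$DAF
print(grid)
```

Because the coefficients are rounded, these values will differ slightly from what predict() returns on the actual model object.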
All of my variables "Diabetes", "AtrialFib" and "DAF" have the same length and no missing values:
> length(COPD$Diabetes)
[1] 101
> length(DAF)
[1] 101
> length(COPD$AtrialFib)
[1] 101
> sum(is.na(COPD$Diabetes))
[1] 0
> sum(is.na(COPD$DAF))
[1] 0
> sum(is.na(COPD$AtrialFib))
[1] 0
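One thing worth checking here: the transcript calls length(DAF) on a standalone object but sum(is.na(COPD$DAF)) on a data frame column. If DAF is not actually a column of COPD, the second call still returns 0, because COPD$DAF is NULL. A minimal sketch with a toy data frame (toy is a hypothetical stand-in, not the real COPD data):

```r
# Toy stand-in for COPD; the real data set is not reproduced here.
toy <- data.frame(Diabetes = c(1, 0, 1), AtrialFib = c(1, 1, 0))

# Standalone vector in the workspace, NOT a column of toy.
DAF <- toy$Diabetes * toy$AtrialFib

in_frame   <- "DAF" %in% names(toy)   # is DAF a column of the data frame?
col_value  <- toy$DAF                 # NULL when the column does not exist
na_count   <- sum(is.na(toy$DAF))     # is.na(NULL) is logical(0), so the sum is 0

print(in_frame)
print(is.null(col_value))
print(na_count)
```

So a zero from sum(is.na(COPD$DAF)) does not by itself show that DAF lives inside COPD; checking names(COPD) is the more direct test.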
Here is some more information about my data set:
> str(COPD)
'data.frame': 101 obs. of 24 variables:
$ X : int 1 2 3 4 5 6 7 8 9 10 ...
$ ID : int 58 57 62 145 136 84 93 27 114 152 ...
$ AGE : int 77 79 80 56 65 67 67 83 72 75 ...
$ PackHistory : num 60 50 11 60 68 26 50 90 50 6 ...
$ COPDSEVERITY: Factor w/ 4 levels "MILD","MODERATE",..: 3 2 2 4 3 2 3 3 2 3 ...
$ MWT1 : int 120 165 201 210 204 216 214 214 231 226 ...
$ MWT2 : int 120 176 180 210 210 180 237 237 237 240 ...
$ MWT1Best : int 120 176 201 210 210 216 237 237 237 240 ...
$ FEV1 : num 1.21 1.09 1.52 0.47 1.07 1.09 0.69 0.68 2.13 1.06 ...
$ FEV1PRED : num 36 56 68 14 42 50 35 32 63 46 ...
$ FVC : num 2.4 1.64 2.3 1.14 2.91 1.99 1.31 2.23 4.38 2.06 ...
$ FVCPRED : int 98 65 86 27 98 60 48 77 80 75 ...
$ CAT : int 25 12 22 28 32 29 29 22 25 31 ...
$ HAD : num 8 21 18 26 18 21 30 2 6 20 ...
$ SGRQ : num 69.5 44.2 44.1 62 75.6 ...
$ AGEquartiles: int 4 4 4 1 1 2 2 4 3 3 ...
$ copd : int 3 2 2 4 3 2 3 3 2 3 ...
$ gender : Factor w/ 2 levels "0","1": 2 1 1 2 2 1 1 2 2 1 ...
$ smoking : int 2 2 2 2 2 1 1 2 1 2 ...
$ Diabetes : int 1 1 1 0 0 1 1 1 1 0 ...
$ muscular : int 0 0 0 0 1 0 0 0 0 1 ...
$ hypertension: int 0 0 0 1 1 0 0 0 0 0 ...
$ AtrialFib : int 1 1 1 1 0 1 1 1 1 0 ...
$ IHD : int 0 1 0 0 0 0 0 0 0 0 ...
I created DAF by multiplying Diabetes and AtrialFib. Here is more information about r123:
> DAF<-COPD$Diabetes*COPD$AtrialFib
> str(DAF)
int [1:101] 1 1 1 0 0 1 1 1 1 0 ...
> r123<-lm(MWT1Best~factor(Diabetes)+factor(AtrialFib)+factor(DAF), data=COPD)
> summary(r123)
Call:
lm(formula = MWT1Best ~ factor(Diabetes) + factor(AtrialFib) +
factor(DAF), data = COPD)
Residuals:
    Min      1Q  Median      3Q     Max
-218.15  -51.88   18.70   51.85  270.86
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)          428.14      10.39  41.200  < 2e-16 ***
factor(Diabetes)1     -7.69      28.02  -0.274  0.78436
factor(AtrialFib)1   -72.05      29.21  -2.467  0.01541 *
factor(DAF)1        -130.11      47.70  -2.727  0.00759 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 86.32 on 96 degrees of freedom
(1 observation deleted due to missingness)
Multiple R-squared: 0.3635, Adjusted R-squared: 0.3437
F-statistic: 18.28 on 3 and 96 DF, p-value: 1.841e-09
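A possible explanation, based on the output above: str(COPD) lists 24 variables and DAF is not among them, so factor(DAF) in the formula is looked up in the workspace rather than in the data. That vector has 101 values, while any new data passed to the model (or the 100 rows left after the one deleted observation) has a different number of rows. The sketch below reproduces the same message with base predict() on simulated data; sim, fit, nd and msg are hypothetical names, and the simulated values are not the real COPD data.

```r
set.seed(42)
# Simulated stand-in for COPD (hypothetical values, not the real data set).
n <- 101
sim <- data.frame(
  Diabetes  = rbinom(n, 1, 0.4),
  AtrialFib = rbinom(n, 1, 0.4)
)
sim$MWT1Best <- 428 - 8 * sim$Diabetes - 72 * sim$AtrialFib + rnorm(n, sd = 80)

# As in the question: DAF is a standalone vector, not a column of the data frame.
DAF <- sim$Diabetes * sim$AtrialFib
fit <- lm(MWT1Best ~ factor(Diabetes) + factor(AtrialFib) + factor(DAF), data = sim)

# nd has 4 rows, but factor(DAF) is still taken from the workspace (101 values).
nd <- expand.grid(Diabetes = c(0, 1), AtrialFib = c(0, 1))
msg <- tryCatch(
  { predict(fit, newdata = nd); "no error" },
  error = function(e) conditionMessage(e)
)
print(msg)

# Supplying DAF as a column of the new data lets predict() find it there.
nd$DAF <- nd$Diabetes * nd$AtrialFib
nd$fit <- predict(fit, newdata = nd)
print(nd)
```

Whether prediction() fails for exactly this reason has not been confirmed in the thread, but storing DAF as a column (for example COPD$DAF <- COPD$Diabetes * COPD$AtrialFib) before fitting would rule it out.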
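【Comments】:
- I'm not sure which package the prediction() function comes from. The base predict() function, however, takes a data frame rather than a list.
- Thanks for the pointer. The function comes from a package of the same name, prediction; it is supposed to extract predicted values from a model object via predict and return a data frame. I tried dropping the list as you suggested, but unfortunately ran into a similar error:
> prediction(r123, data=find_data(r123, parent.frame(COPD$Diabetes)), at = list(Diabetes=c(0,1)))
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
variable lengths differ (found for 'factor(DAF)')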
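Following up on the first comment, here is a minimal sketch of the base predict() route with a data frame of new values. It writes the interaction directly in the formula instead of using a separate DAF vector, so every term is built from columns of the data. The data are simulated, and sim, fit and newdata are hypothetical names, not objects from the question.

```r
set.seed(1)
# Simulated stand-in for COPD (hypothetical values, not the real data set).
n <- 100
sim <- data.frame(
  Diabetes  = rbinom(n, 1, 0.3),
  AtrialFib = rbinom(n, 1, 0.3)
)
sim$MWT1Best <- 428 - 8 * sim$Diabetes - 72 * sim$AtrialFib -
  130 * sim$Diabetes * sim$AtrialFib + rnorm(n, sd = 80)

# a * b expands to a + b + a:b, so no separate product variable is needed.
fit <- lm(MWT1Best ~ factor(Diabetes) * factor(AtrialFib), data = sim)

# predict() takes a data frame; one row per 0/1 combination.
newdata <- expand.grid(Diabetes = c(0, 1), AtrialFib = c(0, 1))
newdata$fit <- predict(fit, newdata = newdata)
print(newdata)
```

With two binary factors and their interaction the model is saturated, so each prediction is simply the mean of MWT1Best within that combination of Diabetes and AtrialFib.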
Tags: r