【Posted】: 2014-10-21 06:39:44
【Question】:
I am using R to run a logistic regression on my data set, which has more than 50 variables. I am running the following code:
ResponseModel <- glm(X...ResponseFlag ~ NetWorth + LOR + IntGrandChld + OccupInput, family = binomial, data = data)
When I call summary() on the model, I get the following output:
> summary(ResponseModel)

Call:
glm(formula = X...ResponseFlag ~ NetWorth + LOR + IntGrandChld +
    OccupInput, family = binomial, data = data)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.2785  -0.9576  -0.8925   1.3736   1.9721

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)   -0.971166   0.164439  -5.906 3.51e-09 ***
NetWorth       0.082168   0.019849   4.140 3.48e-05 ***
LOR           -0.019716   0.006494  -3.036   0.0024 **
IntGrandChld  -0.021544   0.085274  -0.253   0.8005
OccupInput2    0.005796   0.138390   0.042   0.9666
OccupInput3    0.471020   0.289642   1.626   0.1039
OccupInput4   -0.031880   0.120636  -0.264   0.7916
OccupInput5   -0.148898   0.129922  -1.146   0.2518
OccupInput6   -0.481183   0.416277  -1.156   0.2477
OccupInput7   -0.057485   0.218309  -0.263   0.7923
OccupInput8    0.505676   0.123955   4.080 4.51e-05 ***
OccupInput9   -0.382375   0.821362  -0.466   0.6415
OccupInputA  -12.903334 178.064831  -0.072   0.9422
OccupInputB    0.581272   1.003193   0.579   0.5623
OccupInputC   -0.034188   0.294507  -0.116   0.9076
OccupInputD    0.224634   0.385959   0.582   0.5606
OccupInputE   -1.292358   1.072864  -1.205   0.2284
OccupInputF   14.132144 308.212341   0.046   0.9634
OccupInputH    0.622677   1.006982   0.618   0.5363
OccupInputU    0.087526   0.095740   0.914   0.3606
OccupInputV   -1.010939   0.637746  -1.585   0.1129
OccupInputW    0.262031   0.256238   1.023   0.3065
OccupInputX    0.332209   0.428806   0.775   0.4385
OccupInputY    0.059771   0.157135   0.380   0.7037
OccupInputZ    0.638520   0.711979   0.897   0.3698
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 5885.1  on 4467  degrees of freedom
Residual deviance: 5809.6  on 4443  degrees of freedom
AIC: 5859.6

Number of Fisher Scoring iterations: 12

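As the output shows, some new variables such as OccupInput2, OccupInput3, ... have appeared. The actual values of OccupInput are 1, 2, 3, ..., A, B, C, D, ..., but nothing like this happened for NetWorth or LOR.
I am new to R and cannot work out why these new variables are there.
Can anyone explain this? Thanks in advance.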
【Discussion】:
- Yes, OccupInput is a factor, and it has those values.
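
To illustrate the comment above, here is a minimal sketch on made-up toy data (the data frame `toy` and all of its values are assumptions, not the poster's data). It shows what R does by default when a predictor is a factor: the factor is expanded into one 0/1 indicator column per level, except the first (reference) level, and each column is named by pasting the level onto the variable name. Numeric predictors such as NetWorth stay as a single column.

```r
# Toy data, invented for illustration only (not the original data set)
toy <- data.frame(
  NetWorth   = c(1, 5, 3, 8, 2, 7),
  # Levels are given explicitly so the order does not depend on the locale
  OccupInput = factor(c("1", "2", "A", "1", "2", "A"),
                      levels = c("1", "2", "A"))
)

# A factor has a fixed set of levels; the first one is the reference level
print(levels(toy$OccupInput))

# model.matrix() builds the same kind of design matrix that glm() uses:
# the numeric column is kept as-is, the factor becomes indicator columns
mm <- model.matrix(~ NetWorth + OccupInput, data = toy)
print(colnames(mm))
print(mm)
```

In this sketch level 1 gets no column of its own, since it is absorbed into the intercept, which matches the absence of an OccupInput1 row in the summary above. Each OccupInput coefficient is then read as a difference in log-odds relative to that reference level; `relevel()` can be used to choose a different reference.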
Tags: r logistic-regression