【发布时间】:2019-01-31 10:00:51
【问题描述】:
我在两个程序中都使用了逻辑回归方法,并且想知道为什么我得到不同的结果,尤其是在系数方面。结果 Infection 为 (1, 0),Flushed 是一个连续变量。
Python:
import statsmodels.api as sm
logit_model=sm.Logit(data['INFECTION'], data['Flushed'])
result=logit_model.fit()
print(result.summary())
结果:
Logit Regression Results
==============================================================================
Dep. Variable: INFECTION No. Observations: 414
Model: Logit Df Residuals: 413
Method: MLE Df Model: 0
Date: Fri, 24 Aug 2018 Pseudo R-squ.: -1.388
Time: 15:47:42 Log-Likelihood: -184.09
converged: True LL-Null: -77.104
LLR p-value: nan
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
Flushed -0.6467 0.070 -9.271 0.000 -0.783 -0.510
==============================================================================
R:
mylogit <- glm(INFECTION ~ Flushed, data = cvc, family = "binomial")
summary(mylogit)
结果:
Call:
glm(formula = INFECTION ~ Flushed, family = "binomial", data = cvc)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.0598 -0.3107 -0.2487 -0.2224 2.8051
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.91441 0.38639 -10.131 < 2e-16 ***
Flushed 0.22696 0.06049 3.752 0.000175 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
【问题讨论】:
-
请发布包含数据集 URL 的可重现代码。
-
相关:Why are the logistic regression results different between statsmodels and R?。请在“Signif.code:”下方发布其余的 R 摘要。你有相同数量的DF吗?你是如何处理因子或假人的?
Df Model: 0 Df Residuals: 413听起来不对
标签: python r statistics regression