【Question title】: Estimate logit models for factor variable produces error
【Posted】: 2018-04-14 12:55:02
【Question】:

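I am unable to estimate a logit model whose dependent variable is a factor. I created a reproducible example to explain the problem and show the error message.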

## create a reproducible example that replicates the problem
set.seed(12) # reproducibility of the "randomly" generated data. 
df<-data.frame(dummy=as.factor(rep(c("yes","no"),100)), # factor encoding
               x=rnorm(n = 200,mean = 5,sd = 1)) # some predictor variable


# calculate regression with different encodings
summary(glm(formula = dummy~x,data = df)) # does not work

This call produces the following error message:

Error in glm.fit(x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  : 
  NA/NaN/Inf in 'y'
In addition: Warning messages:
1: In Ops.factor(y, mu) : ‘-’ not meaningful for factors
2: In Ops.factor(eta, offset) : ‘-’ not meaningful for factors
3: In Ops.factor(y, mu) : ‘-’ not meaningful for factors

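I don't quite understand this message. Is something wrong with the scale of the data (a factor), or with how I am calling the function? Any help would be much appreciated.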

【Question comments】:

    Tags: r char logistic-regression


    【Answer 1】:

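    glm() defaults to family = gaussian, which expects a numeric response, so a factor response fails. Add family = "binomial" to request a logistic regression, and the model fits: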

    > fit <- glm(formula = dummy ~ x, data = df, family = "binomial")
    > summary(fit)
    
    Call:
    glm(formula = dummy ~ x, family = "binomial", data = df)
    
    Deviance Residuals: 
         Min        1Q    Median        3Q       Max  
    -1.18456  -1.17736  -0.00041   1.17736   1.18342  
    
    Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
    (Intercept)  0.028747   0.734674   0.039    0.969
    x           -0.005782   0.145003  -0.040    0.968
    
    (Dispersion parameter for binomial family taken to be 1)
    
        Null deviance: 277.26  on 199  degrees of freedom
    Residual deviance: 277.26  on 198  degrees of freedom
    AIC: 281.26
    
    Number of Fisher Scoring iterations: 3
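
To make it clearer which outcome the fitted model describes, here is a minimal sketch that reuses the data from the question. With a factor response, the binomial family treats the first factor level as failure and all other levels as success, so the model here estimates P(dummy == "yes"). The names `p_yes`, `dummy01`, and `fit01` are introduced only for this illustration and are not part of the original post.

```r
# Same simulated data as in the question.
set.seed(12)
df <- data.frame(dummy = as.factor(rep(c("yes", "no"), 100)),
                 x = rnorm(n = 200, mean = 5, sd = 1))

# glm() uses family = gaussian unless told otherwise; the default is
# visible in the function's formal arguments.
deparse(formals(glm)$family)   # "gaussian"

# Factor levels are sorted, so "no" is the first (reference) level.
levels(df$dummy)               # "no" "yes"

fit <- glm(dummy ~ x, data = df, family = binomial)

# Fitted probabilities of the non-reference level ("yes") for a few
# illustrative x values (hypothetical inputs).
p_yes <- predict(fit, newdata = data.frame(x = c(4, 5, 6)),
                 type = "response")

# An explicit 0/1 response should give the same coefficients.
df$dummy01 <- as.integer(df$dummy == "yes")
fit01 <- glm(dummy01 ~ x, data = df, family = binomial)
all.equal(coef(fit), coef(fit01))
```

If "no" should be the modeled outcome instead, one option is to change the reference level first, for example with `relevel(df$dummy, ref = "yes")`, before fitting.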
    

    【Comments】:
