【发布时间】:2013-10-20 00:04:57
【问题描述】:
在处理 glmnet 与 glm 时,我遇到了 lambda=0 和 family="poisson" 的收敛问题。我的理解是,当 lambda=0(和 alpha=1,默认值)时,答案应该基本相同。
下面的代码与 glmnet 帮助页面 (?glmnet) 上的泊松示例略有不同。唯一的变化是 nzc = p 使得所有变量都在真实模型中
N=1000; p=50
nzc=p
x=matrix(rnorm(N*p),N,p)
beta=rnorm(nzc)
f = x[,seq(nzc)]%*%beta
mu=exp(f)
y=rpois(N,mu)
#With lambda=0 glmnet throws the convergence error shown below
fit=glmnet(x,y,family="poisson",lambda=0)
#It works with default lambda passed in
# but estimates are quite different from glm.
fit=glmnet(x,y,family="poisson") #use default lambdas
fit2=glm(y~x,family="poisson")
plot(coef(fit2)[2:(p+1)],
coef(fit,s=min(fit$lambda))[2:(p+1)],
xlab="glm",ylab="glmnet")
abline(0,1)
#works fine with gaussian response and lambda=0 or default lambda
#glm and glmnet identical
mu = f
y=rnorm(N,mu)
fit=glmnet(x,y,family="gaussian",lambda=0)
fit2=glm(y~x)
plot(coef(fit2)[2:(p+1)], coef(fit)[2:(p+1)])
abline(0,1)
这是错误信息
Warning messages:
1: from glmnet Fortran code (error code -1); Convergence for 1th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned
2: In getcoef(fit, nvars, nx, vnames) :an empty model has been returned; probably a convergence issue
更新: 问题似乎在于 glmnet 在 family="poisson" 时估计的截距,与 lambda 本身的设置无关。
fit=glmnet(x,y,family="poisson")
#intercept should be close to 0
coef(fit)[1,]
#but it is huge
#passing in intercept=FALSE however generates the convergence error again
fit=glmnet(x,y,family="poisson", intercept=FALSE)
【问题讨论】:
标签: r bigdata glm poisson glmnet