【Question title】: Why do we need to tune lambda with caret::train(..., method = "glmnet") and cv.glmnet()?
【Posted】: 2018-05-11 14:43:02
【Question】:

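As we can see, both caret::train(..., method = "glmnet") with cross-validation and cv.glmnet() find the lambda.min that minimizes the cross-validation error. The final best-fit model should be the one with lambda.min. So why do we need to set up a grid of lambda values for the training process?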

【Comments】:

    Tags: r glmnet


    【Answer 1】:

    We use a custom tuning grid for glmnet models because the default tuning grid is very small, and we may want to explore more potential glmnet models.

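    glmnet can fit 2 different kinds of penalized models, and it has 2 tuning parameters:

    1. alpha
      • Ridge regression (alpha = 0)
      • Lasso regression (alpha = 1)
      • Values between 0 and 1 mix the two penalties (elastic net)
    2. lambda
      • The strength of the penalty on the coefficients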
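
    To make the role of alpha concrete, here is a minimal sketch on simulated data (the data, the seed, and the value s = 0.5 are assumptions for illustration, not part of the original answer). It fits a ridge path and a lasso path and counts how many slope coefficients are non-zero at the same lambda.

    ```r
    library(glmnet)

    set.seed(42)
    n <- 100
    p <- 20
    x <- matrix(rnorm(n * p), n, p)
    # Only the first two predictors carry signal; the rest are noise
    y <- 2 * x[, 1] - 1.5 * x[, 2] + rnorm(n)

    # alpha = 0 is ridge, alpha = 1 is lasso; each call fits a whole lambda path
    ridge <- glmnet(x, y, alpha = 0)
    lasso <- glmnet(x, y, alpha = 1)

    # Extract coefficients at one penalty strength and drop the intercept row
    ridge_coef <- as.matrix(coef(ridge, s = 0.5))[-1, 1]
    lasso_coef <- as.matrix(coef(lasso, s = 0.5))[-1, 1]

    # Ridge shrinks coefficients but keeps them non-zero; lasso sets some exactly to zero
    n_ridge <- sum(ridge_coef != 0)
    n_lasso <- sum(lasso_coef != 0)
    ```

    Comparing n_ridge and n_lasso shows why the two settings are worth treating as different models when tuning.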

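    A glmnet model can fit many models at once (for a single alpha, all values of lambda are fit simultaneously), so we can pass a large number of lambda values, which control the amount of penalization in the model.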

    train() is smart enough to fit only one model per alpha value, passing all of the lambda values in a single call so they are fit simultaneously.
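
    The point about fitting all lambda values together can be sketched directly with glmnet, outside of caret. The simulated data and the grid below are assumptions for illustration. The sketch also speaks to the original question: cv.glmnet() can only pick lambda.min from the lambda values it actually evaluated, so the grid decides which candidates are in the running.

    ```r
    library(glmnet)

    set.seed(1)
    x <- matrix(rnorm(200 * 10), 200, 10)
    y <- x[, 1] + rnorm(200)

    # Hypothetical grid, written in decreasing order as glmnet uses it internally
    lambda_grid <- seq(1, 0.0001, length = 10)

    # One call fits the model for every lambda in the grid (alpha fixed at 1)
    fit <- glmnet(x, y, alpha = 1, lambda = lambda_grid)

    # coef() returns one column of coefficients per lambda value that was fit
    coef_path <- as.matrix(coef(fit))

    # Cross-validate over the same grid; lambda.min is the grid value
    # with the lowest mean cross-validated error
    cv_fit <- cv.glmnet(x, y, alpha = 1, lambda = lambda_grid)
    chosen <- cv_fit$lambda.min
    ```

    If the best penalty lies between or outside the grid points, it cannot be selected, which is one reason to supply a wider or finer grid than the default.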

    Example:

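    library(caret)

    # myControl is not shown in the original answer. The output below (ROC, Sens, Spec,
    # per-fold progress) is consistent with a setup like this one, which is an assumption.
    myControl <- trainControl(method = "cv", number = 10, summaryFunction = twoClassSummary,
      classProbs = TRUE, verboseIter = TRUE)

    # Make a custom tuning grid: 2 alpha values x 10 lambda values
    tuneGrid <- expand.grid(alpha = 0:1, lambda = seq(0.0001, 1, length = 10))

    # Fit a model (overfit is the answerer's data set and is not defined here)
    model <- train(y ~ ., overfit, method = "glmnet",
      tuneGrid = tuneGrid, trControl = myControl
    )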
    
    
    # Sample Output
    Warning message: The metric "Accuracy" was not in the result set. ROC will be used instead.
    + Fold01: alpha=0, lambda=1 
    - Fold01: alpha=0, lambda=1 
    + Fold01: alpha=1, lambda=1 
    - Fold01: alpha=1, lambda=1 
    + Fold02: alpha=0, lambda=1 
    - Fold02: alpha=0, lambda=1 
    + Fold02: alpha=1, lambda=1 
    - Fold02: alpha=1, lambda=1 
    + Fold03: alpha=0, lambda=1 
    - Fold03: alpha=0, lambda=1 
    + Fold03: alpha=1, lambda=1 
    - Fold03: alpha=1, lambda=1 
    + Fold04: alpha=0, lambda=1 
    - Fold04: alpha=0, lambda=1 
    + Fold04: alpha=1, lambda=1 
    - Fold04: alpha=1, lambda=1 
    + Fold05: alpha=0, lambda=1 
    - Fold05: alpha=0, lambda=1 
    + Fold05: alpha=1, lambda=1 
    - Fold05: alpha=1, lambda=1 
    + Fold06: alpha=0, lambda=1 
    - Fold06: alpha=0, lambda=1 
    + Fold06: alpha=1, lambda=1 
    - Fold06: alpha=1, lambda=1 
    + Fold07: alpha=0, lambda=1 
    - Fold07: alpha=0, lambda=1 
    + Fold07: alpha=1, lambda=1 
    - Fold07: alpha=1, lambda=1 
    + Fold08: alpha=0, lambda=1 
    - Fold08: alpha=0, lambda=1 
    + Fold08: alpha=1, lambda=1 
    - Fold08: alpha=1, lambda=1 
    + Fold09: alpha=0, lambda=1 
    - Fold09: alpha=0, lambda=1 
    + Fold09: alpha=1, lambda=1 
    - Fold09: alpha=1, lambda=1 
    + Fold10: alpha=0, lambda=1 
    - Fold10: alpha=0, lambda=1 
    + Fold10: alpha=1, lambda=1 
    - Fold10: alpha=1, lambda=1 
    Aggregating results
    Selecting tuning parameters
    Fitting alpha = 1, lambda = 1 on full training set
    
    
    # Print model to console
    model
    
    
    # Sample Output
    glmnet 
    
    250 samples
    200 predictors
      2 classes: 'class1', 'class2' 
    
    No pre-processing
    Resampling: Cross-Validated (10 fold) 
    Summary of sample sizes: 225, 225, 225, 225, 224, 226, ... 
    Resampling results across tuning parameters:
    
      alpha  lambda  ROC        Sens  Spec     
      0      0.0001  0.3877717  0.00  0.9786232
      0      0.1112  0.4352355  0.00  1.0000000
      0      0.2223  0.4546196  0.00  1.0000000
      0      0.3334  0.4589674  0.00  1.0000000
      0      0.4445  0.4718297  0.00  1.0000000
      0      0.5556  0.4762681  0.00  1.0000000
      0      0.6667  0.4783514  0.00  1.0000000
      0      0.7778  0.4826087  0.00  1.0000000
      0      0.8889  0.4869565  0.00  1.0000000
      0      1.0000  0.4869565  0.00  1.0000000
      1      0.0001  0.3368659  0.05  0.9188406
      1      0.1112  0.5000000  0.00  1.0000000
      1      0.2223  0.5000000  0.00  1.0000000
      1      0.3334  0.5000000  0.00  1.0000000
      1      0.4445  0.5000000  0.00  1.0000000
      1      0.5556  0.5000000  0.00  1.0000000
      1      0.6667  0.5000000  0.00  1.0000000
      1      0.7778  0.5000000  0.00  1.0000000
      1      0.8889  0.5000000  0.00  1.0000000
      1      1.0000  0.5000000  0.00  1.0000000
    
    ROC was used to select the optimal model using  the largest value.
    The final values used for the model were alpha = 1 and lambda = 1.
    
    
    # Plot model
    plot(model)
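
    Once train() has finished, the selected combination and the corresponding coefficients can be inspected. The sketch below uses simulated data and the default Accuracy metric instead of the answer's overfit data set and ROC setup, so the object names dat, grid, ctrl, and fit are assumptions for illustration.

    ```r
    library(caret)

    set.seed(42)
    n <- 200
    # Ten numeric predictors named X1..X10 and a two-class outcome driven by X1
    dat <- data.frame(matrix(rnorm(n * 10), n, 10))
    dat$y <- factor(ifelse(dat$X1 + rnorm(n) > 0, "class1", "class2"))

    grid <- expand.grid(alpha = 0:1, lambda = seq(0.0001, 1, length = 10))
    ctrl <- trainControl(method = "cv", number = 5)

    fit <- train(y ~ ., data = dat, method = "glmnet",
                 tuneGrid = grid, trControl = ctrl)

    # The alpha/lambda pair with the best resampled performance
    best <- fit$bestTune

    # Coefficients of the final glmnet fit at the selected lambda
    best_coef <- coef(fit$finalModel, s = best$lambda)
    ```

    fit$results holds one row per grid combination, which is the same table that printing the model displays.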
    

    【Comments】:

    • @Levande I didn't know you could use glmnet for both ridge regression and lasso! Thanks