【Question title】: xgboost.cv returns an area under the curve less than 0.5
【Posted】: 2017-06-16 02:32:08
【Question】:

When I run xgboost cross-validation, I get an area under the curve (AUC) below 0.5.

I am running xgboost.cv as follows:

import numpy as np
import xgboost as xgb
from sklearn.model_selection import StratifiedKFold

best_params_grid_search = {'base_score': 0.5,
 'colsample_bylevel': 1,
 'colsample_bytree': 0.8,
 'gamma': 0,
 'learning_rate': 0.3,
 'max_delta_step': 0,
 'max_depth': 3,
 'min_child_weight': 3,
 'missing': np.nan,
 'n_estimators': 15,
 'objective': 'binary:logistic',
 'reg_alpha': 0,
 'reg_lambda': 1,
 'scale_pos_weight': 1,
 'seed': 5,
 'silent': 1,
 'subsample': 0.8}

skf_inner = StratifiedKFold(n_splits=n_fold_inner, random_state=5, shuffle=True)

dtrain = xgb.DMatrix(X_train, label=y_train, missing=np.nan)

num_rounds = 20
cv_xgb4 = xgb.cv(best_params_grid_search, dtrain,
                 num_boost_round=num_rounds, folds=skf_inner,
                 metrics={'auc'}, seed=5)

But my test AUC is below 0.5:

    test-auc-mean  test-auc-std  train-auc-mean  train-auc-std
0        0.402675      0.088828        0.777729       0.058559
1        0.390638      0.124389        0.890424       0.044356
2        0.418827      0.068236        0.932992       0.031358
3        0.448971      0.073219        0.946747       0.011304
4        0.460597      0.118598        0.956311       0.008302
5        0.437963      0.057661        0.970979       0.005968
6        0.461831      0.095017        0.978789       0.010346
7        0.422428      0.111894        0.977095       0.014329
8        0.419650      0.117329        0.983260       0.011606
9        0.433745      0.106113        0.989522       0.008979
10       0.440947      0.097941        0.992227       0.009497
11       0.449588      0.071629        0.994396       0.006438
12       0.429218      0.061360        0.995858       0.004400
13       0.455144      0.064862        0.998051       0.002757
14       0.443416      0.057515        0.999513       0.000689
15       0.440535      0.079628        0.999513       0.000689
16       0.446296      0.077557        1.000000       0.000000
17       0.450000      0.074674        1.000000       0.000000
18       0.468107      0.092640        1.000000       0.000000
19       0.451029      0.096165        1.000000       0.000000

Thanks in advance.

【Question comments】:

    Tags: python cross-validation xgboost auc


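    【Solution 1】:

    First, regarding an AUC below 0.5, see https://en.wikipedia.org/wiki/Receiver_operating_characteristic. Second, your model is heavily overfitted (train-auc-mean reaches 1.0 while test-auc-mean stays around 0.45). This means you should make your algorithm more robust.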
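As background on the first point: an AUC below 0.5 means the scores rank positives below negatives more often than not, and reversing the scores gives exactly 1 - AUC. The sketch below illustrates that property in plain Python. The helper `auc` and the labels and scores are made up for illustration and are not taken from the question. Note that a held-out AUC of about 0.45 with a standard deviation near 0.1, as in the table in the question, may simply be indistinguishable from chance, in which case flipping the predictions would not be a real fix.

```python
def auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up example data (hypothetical, not from the question).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.2, 0.4, 0.7, 0.3, 0.6, 0.8]

auc_original = auc(labels, scores)                    # ranking worse than random
auc_flipped = auc(labels, [1.0 - s for s in scores])  # same scores, reversed order

print(auc_original, auc_flipped)
```

The two values sum to 1, so an AUC well below 0.5 carries as much ranking information as one equally far above 0.5, whereas an AUC close to 0.5 carries very little either way.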

    【Comments】:
