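【Posted】: 2017-08-05 05:52:04
【Question】:
I am trying to use XGBoost and set eval_metric to auc (as described here).
This works fine when I use the classifier directly, but fails when I try to use it inside a pipeline.
What is the correct way to pass .fit arguments to an sklearn pipeline?
Example: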
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris
from xgboost import XGBClassifier
import xgboost
import sklearn
print('sklearn version: %s' % sklearn.__version__)
print('xgboost version: %s' % xgboost.__version__)
X, y = load_iris(return_X_y=True)
# Without using the pipeline:
xgb = XGBClassifier()
xgb.fit(X, y, eval_metric='auc') # works fine
# Making a pipeline with this classifier and a scaler:
pipe = Pipeline([('scaler', StandardScaler()), ('classifier', XGBClassifier())])
# using the pipeline, but not optimizing for 'auc':
pipe.fit(X, y) # works fine
# however this does not work (even after correcting the underscores):
pipe.fit(X, y, classifier__eval_metric='auc') # fails
Error: TypeError: before_fit() got an unexpected keyword argument 'classifier__eval_metric'
About the xgboost version: xgboost.__version__ shows 0.6, and pip3 freeze | grep xgboost shows xgboost==0.6a2.
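For reference, here is a minimal sketch of the `<step name>__<parameter>` convention that `Pipeline.fit` uses to forward fit-time keyword arguments to a single step. To keep it runnable without xgboost, it swaps in `LogisticRegression` as the final step and `sample_weight` as the fit argument; both are stand-ins chosen for illustration and are not part of the original example, and the weights are made up.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Stand-in final step (assumption): LogisticRegression instead of XGBClassifier,
# so the sketch depends on scikit-learn only.
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', LogisticRegression(max_iter=1000)),
])

# Illustrative weights, invented for this sketch: upweight class 2.
weights = np.where(y == 2, 5.0, 1.0)

# The keyword is '<step name>__<fit parameter>' with a double underscore.
# Pipeline.fit strips the prefix and passes sample_weight=weights
# to the 'classifier' step's own fit().
pipe.fit(X, y, classifier__sample_weight=weights)

predictions = pipe.predict(X)
```

The same pattern applies to `classifier__eval_metric='auc'`, but only if the `fit` method of the final step accepts that keyword. Whether `XGBClassifier.fit` takes `eval_metric` varies between xgboost versions, so it is worth checking the signature of the installed version.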
【Comments】:
-
Have you tried 'roc_auc'?
-
It works for me with sklearn version: 0.18 and xgboost version: 0.6
Tags: python scikit-learn classification pipeline xgboost