【Posted】: 2021-06-02 09:57:40
【Problem description】:
I am using the code below to tune xgboost parameters with random search:
import xgboost as xgb
from sklearn.metrics import mean_absolute_error, r2_score, fbeta_score, make_scorer
from sklearn.model_selection import RandomizedSearchCV
from xgboost.sklearn import XGBRegressor

# Search space sampled by RandomizedSearchCV
parameters = {'objective': ['reg:squarederror'],
              'booster': ['gbtree', 'gblinear'],
              'learning_rate': [0.1],
              'max_depth': [7, 10, 15, 20],
              'min_child_weight': [10, 15, 20, 25],
              'colsample_bytree': [0.8, 0.9, 1],
              'n_estimators': [300, 400, 500, 600],
              'reg_alpha': [0.5, 0.2, 1],
              'reg_lambda': [2, 3, 5],
              'gamma': [1, 2, 3]}

xgb_model = XGBRegressor(random_state=30)

# 15 sampled candidates, each evaluated with 5-fold CV and scored by negative MAE
grid_obj_xgb = RandomizedSearchCV(xgb_model, parameters, cv=5, n_iter=15,
                                  scoring='neg_mean_absolute_error',
                                  verbose=5, n_jobs=12)

# df_train, y_train and df_test are defined elsewhere (not shown here)
grid_obj_xgb.fit(df_train, y_train, verbose=1)

y_pred_train = grid_obj_xgb.predict(df_train)
y_pred_test = grid_obj_xgb.predict(df_test)
err_xgb_train = mean_absolute_error(y_train, y_pred_train, multioutput='raw_values')
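My df_train has 1,200,000 rows and 75 columns, and the search takes a very long time. Is there a way to speed it up, for example by using more cores, or is this runtime expected given the amount of data I have?
I was able to run a single iteration and get results, so the code itself works, but even one iteration takes a long time.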
【Discussion】:
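To get a feel for how much work the search in the question asks for, here is a rough back-of-the-envelope sketch. It only does arithmetic on the numbers given in the question (1,200,000 rows, `n_iter=15`, `cv=5`, `n_estimators` up to 600). The helper `search_cost` is a hypothetical name introduced for illustration; it is not part of scikit-learn or xgboost, and it assumes the default `refit=True`, which adds one final fit on the full training set.

```python
def search_cost(n_rows, n_iter, cv, max_estimators, refit=True):
    """Estimate the workload of a randomized search with k-fold CV.

    Hypothetical helper for illustration only.
    """
    cv_fits = n_iter * cv                        # one fit per candidate per fold
    total_fits = cv_fits + (1 if refit else 0)   # plus the final refit, if enabled
    rows_per_cv_fit = n_rows * (cv - 1) // cv    # each fold trains on (cv-1)/cv of the data
    max_rounds = total_fits * max_estimators     # upper bound on boosting rounds overall
    return {
        "total_fits": total_fits,
        "rows_per_cv_fit": rows_per_cv_fit,
        "max_boosting_rounds": max_rounds,
    }


cost = search_cost(n_rows=1_200_000, n_iter=15, cv=5, max_estimators=600)
print(cost)
```

Under these assumptions the search performs 76 fits, each CV fit sees 960,000 rows, and the total is bounded by 45,600 boosting rounds, so a long runtime is not surprising by itself. Two things that may be worth trying, without any guarantee for this dataset: timing a single candidate on a subsample of df_train before launching the full search, and checking whether `tree_method='hist'` on `XGBRegressor` shortens each fit.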
Tags: python machine-learning scikit-learn xgboost gridsearchcv