【Posted】: 2021-10-21 21:45:39
【Question】:
I am trying to tune the learning rate and max_depth parameters of an XGB regression model:
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor
param_grid = [
    # trying learning rates from 0.01 to 0.2
    {'eta ': [0.01, 0.05, 0.1, 0.2]},
    # and max depth from 4 to 10
    {'max_depth': [4, 6, 8, 10]}
]
xgb_model = XGBRegressor(random_state = 0)
grid_search = GridSearchCV(xgb_model, param_grid, cv=5,
                           scoring='neg_root_mean_squared_error',
                           return_train_score=True)
grid_search.fit(final_OH_X_train_scaled, y_train)
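final_OH_X_train_scaled is the training dataset and contains only numeric features.
y_train holds the training labels, which are also numeric.
This is the error returned: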
FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan.
I have looked at other similar questions but have not found an answer.
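One detail worth ruling out before looking at the data: in the grid above, the learning-rate key is written as 'eta ' with a trailing space. Whether that is what makes every fit fail here is an assumption, not something the warning itself confirms. The sketch below uses a hypothetical helper, keys_with_whitespace (not part of scikit-learn or XGBoost), to flag such keys in a parameter grid.

```python
def keys_with_whitespace(param_grid):
    """Return parameter names that contain any whitespace character."""
    # GridSearchCV accepts either one dict or a list of dicts; handle both.
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    suspicious = []
    for grid in grids:
        for name in grid:
            if any(ch.isspace() for ch in name) and name not in suspicious:
                suspicious.append(name)
    return suspicious


# The grid exactly as written in the question (note the space after eta).
param_grid = [
    {'eta ': [0.01, 0.05, 0.1, 0.2]},
    {'max_depth': [4, 6, 8, 10]},
]

print(keys_with_whitespace(param_grid))  # → ['eta ']
```

To see the real exception rather than the generic warning, GridSearchCV can also be given error_score='raise', which re-raises the underlying fit error instead of recording nan.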
I also tried:
param_grid = [
    # trying learning rates from 0.01 to 0.2
    # and max depth from 4 to 10
    {'eta ': [0.01, 0.05, 0.1, 0.2], 'max_depth': [4, 6, 8, 10]}
]
But it produces the same error.
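The two grids are not equivalent searches. In scikit-learn, a list of dicts is explored one dict at a time, whereas a single dict is expanded into every combination of its values. The sketch below imitates that expansion with itertools to count the candidates in each case; count_candidates is a hypothetical helper for illustration, not a scikit-learn function.

```python
from itertools import product


def count_candidates(param_grid):
    """Count the parameter settings a grid search would try."""
    # Accept a single dict or a list of dicts, as GridSearchCV does.
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    total = 0
    for grid in grids:
        # Each dict contributes the Cartesian product of its value lists.
        total += len(list(product(*grid.values())))
    return total


# First attempt: two separate one-parameter grids.
list_grid = [
    {'eta ': [0.01, 0.05, 0.1, 0.2]},
    {'max_depth': [4, 6, 8, 10]},
]
# Second attempt: one joint grid over both parameters.
joint_grid = [
    {'eta ': [0.01, 0.05, 0.1, 0.2], 'max_depth': [4, 6, 8, 10]},
]

print(count_candidates(list_grid))   # → 8
print(count_candidates(joint_grid))  # → 16
```

So the first form varies each parameter on its own while the other stays at its default, and the second tunes them jointly. Both use the same parameter names, so changing the shape of the grid alone would not be expected to change the outcome if a name is the problem.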
EDIT: Here is a sample of the data:
final_OH_X_train_scaled.head()
y_train.head()
EDIT2:
A sample of the data can be recreated as follows:
import pandas as pd

final_OH_X_train_scaled = pd.DataFrame(
    [[0.540617, 1.204666, 1.670791, -0.445424, -0.890944, -0.491098, 0.094999, 1.522411, -0.247443, -0.559572, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
     [0.117467, -2.351903, 0.718969, -0.119721, -0.874705, -0.530832, -1.385230, 2.126612, -0.947731, -0.156967, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
     [0.901138, -0.208256, -0.019134, 0.265250, -0.889128, -0.467753, 0.169306, -0.973256, 0.056164, -0.671978, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
     [2.074639, 0.100602, -1.645121, 0.929598, 0.811911, 1.364560, 0.337242, 0.435187, -0.388075, 1.279959, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
     [2.198099, -0.496254, -0.917933, -1.418407, -0.975889, 1.044495, 0.254181, 1.335285, 2.079415, 2.071974, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]],
    columns=['cont0', 'cont1', 'cont2', 'cont3', 'cont4', 'cont5', 'cont6', 'cont7', 'cont8', 'cont9', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40'])
【Discussion】:
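-
Nothing looks obviously wrong to me. Could you post a few rows of your final_OH_X_train_scaled and y_train data so we can reproduce and debug this? There may be a problem with your data.
-
@TCArlen Thank you very much for your feedback. Please see my edit above.
-
Great, thanks. However, to check and reproduce/debug this on my machine, I need the training rows/labels as code/data so that I can run it myself. Could you post it as data rather than as a screenshot?
-
The data at the link is not the transformed data shown in the screenshot above, from final_OH_X_train_scaled.head(). Please put those values into code, as in this example question: stackoverflow.com/questions/68732791/… Do you see how the DataFrame there is constructed from code, so that it is an example reproducible on another machine? Thanks
-
OK, please see above.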
Tags: python scikit-learn xgboost