【Question title】: XGBRegressor: changing random_state has no effect
【Posted】: 2018-06-11 20:32:59
【Question】:

xgboost.XGBRegressor seems to produce the same results even when it is given a new random seed.

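According to the xgboost documentation for xgboost.XGBRegressor:

seed : int Random number seed. (Deprecated, please use random_state)

random_state : int Random number seed. (replaces seed)

So random_state is the one to use. However, no matter which random_state or seed I pass, the model produces the same results. Is this a bug?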

from xgboost import XGBRegressor
# Note: load_boston was removed in scikit-learn 1.2, so this snippet needs an older release.
from sklearn.datasets import load_boston
import numpy as np
from itertools import product

def xgb_train_predict(random_state=0, seed=None):
    # Fit on the full dataset and return in-sample predictions.
    X, y = load_boston(return_X_y=True)
    xgb = XGBRegressor(random_state=random_state, seed=seed)
    xgb.fit(X, y)
    y_ = xgb.predict(X)
    return y_

# Baseline predictions with the default arguments.
check = xgb_train_predict()

random_state = [1, 42, 58, 69, 72]
seed = [None, 2, 24, 85, 96]

# Every combination yields predictions identical to the baseline.
for r, s in product(random_state, seed):
    y_ = xgb_train_predict(r, s)
    assert np.equal(y_, check).all()
    print('CHECK! \t random_state: {} \t seed: {}'.format(r, s))

[Out]:
    CHECK!   random_state: 1     seed: None
    CHECK!   random_state: 1     seed: 2
    CHECK!   random_state: 1     seed: 24
    CHECK!   random_state: 1     seed: 85
    CHECK!   random_state: 1     seed: 96
    CHECK!   random_state: 42    seed: None
    CHECK!   random_state: 42    seed: 2
    CHECK!   random_state: 42    seed: 24
    CHECK!   random_state: 42    seed: 85
    CHECK!   random_state: 42    seed: 96
    CHECK!   random_state: 58    seed: None
    CHECK!   random_state: 58    seed: 2
    CHECK!   random_state: 58    seed: 24
    CHECK!   random_state: 58    seed: 85
    CHECK!   random_state: 58    seed: 96
    CHECK!   random_state: 69    seed: None
    CHECK!   random_state: 69    seed: 2
    CHECK!   random_state: 69    seed: 24
    CHECK!   random_state: 69    seed: 85
    CHECK!   random_state: 69    seed: 96
    CHECK!   random_state: 72    seed: None
    CHECK!   random_state: 72    seed: 2
    CHECK!   random_state: 72    seed: 24
    CHECK!   random_state: 72    seed: 85
    CHECK!   random_state: 72    seed: 96

【Comments】:

    Tags: python-3.x xgboost


    【Answer 1】:

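    It seems (I did not know this myself before I started looking for an answer :)) that xgboost uses the random generator only for subsampling; see Laurae's comment on a similar GitHub issue. Otherwise the behavior is deterministic.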

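    If you do use subsampling, there is a problem with how the current sklearn API in xgboost handles seed/random_state. seed is indeed marked as deprecated, but it seems that if you provide it, it still takes precedence over random_state, as shown here in the code. This remark is only relevant when seed is not None.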

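    【Comments】:

    • Thanks, the default subsample=1 parameter is easy to overlook.
    • Could you elaborate on the fix? Even after setting seed, random_state, and colsample_bytree, I cannot get reproducible results from XGBoost.
    • Which xgb version do you have? With random_state set, the OP's code with subsampling added produces fully reproducible predictions in xgb 0.90.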