【发布时间】:2018-02-27 02:12:49
【问题描述】:
我正在使用 here 实现的监视器类
class Monitor():
"""Monitor for early stopping in Gradient Boosting for classification.
The monitor checks the validation loss between each training stage. When
too many successive stages have increased the loss, the monitor will return
true, stopping the training early.
Parameters
----------
X_valid : array-like, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples
and n_features is the number of features.
y_valid : array-like, shape = [n_samples]
Target values (integers in classification, real numbers in
regression)
For classification, labels must correspond to classes.
max_consecutive_decreases : int, optional (default=5)
Early stopping criteria: when the number of consecutive iterations that
result in a worse performance on the validation set exceeds this value,
the training stops.
"""
def __init__(self, X_valid, y_valid, max_consecutive_decreases=5):
self.X_valid = X_valid
self.y_valid = y_valid
self.max_consecutive_decreases = max_consecutive_decreases
self.losses = []
def __call__(self, i, clf, args):
if i == 0:
self.consecutive_decreases_ = 0
self.predictions = clf._init_decision_function(self.X_valid)
predict_stage(clf.estimators_, i, self.X_valid, clf.learning_rate,
self.predictions)
self.losses.append(clf.loss_(self.y_valid, self.predictions))
if len(self.losses) >= 2 and self.losses[-1] > self.losses[-2]:
self.consecutive_decreases_ += 1
else:
self.consecutive_decreases_ = 0
if self.consecutive_decreases_ >= self.max_consecutive_decreases:
print("f"
"({}): s {}.".format(self.consecutive_decreases_, i)),
return True
else:
return False
params = { 'n_estimators': nEstimators,
'max_depth': maxDepth,
'min_samples_split': minSamplesSplit,
'min_samples_leaf': minSamplesLeaf,
'min_weight_fraction_leaf': minWeightFractionLeaf,
'min_impurity_decrease': minImpurityDecrease,
'learning_rate': 0.01,
'loss': 'quantile',
'alpha': alpha,
'verbose': 0
}
model = ensemble.GradientBoostingRegressor( **params )
model.fit( XTrain, yTrain, monitor = Monitor( XTest, yTest, 25 ) )
效果很好。但是,我不清楚这条线是什么型号
model.fit( XTrain, yTrain, monitor = Monitor( XTest, yTest, 25 ) )
返回:
1) 没有模型
2) 停止前训练的模型
3) 前25次迭代的模型(注意监视器的参数)
如果不是(3),是否可以让estimator返回3?
我该怎么做?
【问题讨论】:
标签: python-2.7 machine-learning scikit-learn