【发布时间】:2021-04-13 05:38:34
【问题描述】:
我对 GridsearchCV 在其参数搜索中使用的指标感到困惑。我的理解是我的模型对象为它提供了一个指标,这就是用来确定“best_params”的。但情况似乎并非如此。我认为 score=None 是默认值,因此使用了 model.compile() 的 metrics 选项中给出的第一个指标。所以在我的例子中,使用的评分函数应该是 mean_squred_error。接下来描述我对这个问题的解释。
这就是我正在做的事情。我使用 sklearn 在 100,000 次观察中模拟了一些回归数据,其中包含 10 个特征。我在玩 keras,因为我过去通常使用 pytorch,直到现在才真正涉足 keras。在我拥有一组最佳参数后,我注意到 GridsearchCV 调用与 model.fit() 调用的损失函数输出存在差异。现在我知道我可以 refit=True 并且不再重新拟合模型,但我正在尝试了解 keras 和 sklearn GridsearchCV 函数的输出。
要明确说明这里的差异是我所看到的。我使用sklearn模拟了一些数据如下:
# Setting some data basics
N = 10000
feats = 10
# generate regression dataset
X, y = make_regression(n_samples=N, n_features=feats, n_informative=2, noise=3)
# training data and testing data #
X_train = X[:int(N * 0.8)]
y_train = y[:int(N * 0.8)]
X_test = X[int(N * 0.8):]
y_test = y[int(N * 0.8):]
我创建了一个“create_model”函数来调整我正在使用的激活函数(这也是一个简单的概念证明示例)。
def create_model(activation_fn):
# create model
model = Sequential()
model.add(Dense(30, input_dim=feats, activation=activation_fn,
kernel_initializer='normal'))
model.add(Dropout(0.2))
model.add(Dense(10, activation=activation_fn))
model.add(Dropout(0.2))
model.add(Dense(1, activation='linear'))
# Compile model
model.compile(loss='mean_squared_error',
optimizer='adam',
metrics=['mean_squared_error','mae'])
return model
执行网格搜索我得到以下输出
model = KerasRegressor(build_fn=create_model, epochs=50, batch_size=200, verbose=0)
activations = ['linear','relu']
param_grid = dict(activation_fn = activations)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=3)
grid_result = grid.fit(X_train, y_train, verbose=1)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
Best: -21.163454 using {'activation_fn': 'linear'}
好的,所以最好的指标是 21.16 的均方误差(我知道他们翻转符号以创建最大化问题)。因此,当我使用 activation_fn = 'linear' 拟合模型时,我得到的 MSE 完全不同。
best_model = create_model('linear')
history = best_model.fit(X_train, y_train, epochs=50, batch_size=200, verbose=1)
.....
.....
Epoch 49/50
8000/8000 [==============================] - 0s 48us/step - loss: 344.1636 - mean_squared_error: 344.1636 - mean_absolute_error: 12.2109
Epoch 50/50
8000/8000 [==============================] - 0s 48us/step - loss: 326.4524 - mean_squared_error: 326.4524 - mean_absolute_error: 11.9250
history.history['mean_squared_error']
Out[723]:
[10053.778002929688,
9826.66806640625,
......
......
344.16363830566405,
326.45237121582034]
区别在于 326.45 与 21.16。任何关于我误解的见解将不胜感激。如果它们彼此位于合理的邻域内,我会更舒服,因为其中一个是与整个训练数据集相比的一倍误差。但是 21 远不及 326。谢谢!
这里可以看到完整的代码。
import pandas as pd
import numpy as np
from keras import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
from keras.constraints import maxnorm
from sklearn import preprocessing
from sklearn.preprocessing import scale
from sklearn.datasets import make_regression
from matplotlib import pyplot as plt
# Setting some data basics
N = 10000
feats = 10
# generate regression dataset
X, y = make_regression(n_samples=N, n_features=feats, n_informative=2, noise=3)
# training data and testing data #
X_train = X[:int(N * 0.8)]
y_train = y[:int(N * 0.8)]
X_test = X[int(N * 0.8):]
y_test = y[int(N * 0.8):]
def create_model(activation_fn):
# create model
model = Sequential()
model.add(Dense(30, input_dim=feats, activation=activation_fn,
kernel_initializer='normal'))
model.add(Dropout(0.2))
model.add(Dense(10, activation=activation_fn))
model.add(Dropout(0.2))
model.add(Dense(1, activation='linear'))
# Compile model
model.compile(loss='mean_squared_error',
optimizer='adam',
metrics=['mean_squared_error','mae'])
return model
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
# create model
model = KerasRegressor(build_fn=create_model, epochs=50, batch_size=200, verbose=0)
# define the grid search parameters
activations = ['linear','relu']
param_grid = dict(activation_fn = activations)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=3)
grid_result = grid.fit(X_train, y_train, verbose=1)
best_model = create_model('linear')
history = best_model.fit(X_train, y_train, epochs=50, batch_size=200, verbose=1)
history.history.keys()
plt.plot(history.history['mean_absolute_error'])
# summarize results
grid_result.cv_results_
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
【问题讨论】:
标签: keras gridsearchcv