【Question title】: How to plot sklearn's GridSearchCV results vs params?
【Posted】: 2020-06-18 13:20:37
【Question】:
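import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import GridSearchCV

def show3D(searcher, grid_param_1, grid_param_2, name_param_1, name_param_2, rot=0):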
    scores_mean = searcher.cv_results_['mean_test_score']
    scores_mean = np.array(scores_mean).reshape(len(grid_param_2), len(grid_param_1))

    scores_sd = searcher.cv_results_['std_test_score']
    scores_sd = np.array(scores_sd).reshape(len(grid_param_2), len(grid_param_1))

    print('Best params = {}'.format(searcher.best_params_))
    print('Best score = {}'.format(scores_mean.max()))

    _, ax = plt.subplots(1,1)

    # Param1 is the X-axis, Param 2 is represented as a different curve (color line)
    for idx, val in enumerate(grid_param_2):
        ax.plot(grid_param_1, scores_mean[idx, :], '-o', label=name_param_2 + ': ' + str(val))

    ax.tick_params(axis='x', rotation=rot)
    ax.set_title('Grid Search Scores')
    ax.set_xlabel(name_param_1)
    ax.set_ylabel('CV score')
    ax.legend(loc='best')
    ax.grid('on')

from sklearn.linear_model import SGDClassifier

metrics = ['hinge', 'log', 'modified_huber', 'perceptron', 'huber', 'epsilon_insensitive']
penalty = ['l2', 'l1', 'elasticnet']
searcher = GridSearchCV(SGDClassifier(max_iter=10000), {'loss': metrics,
                                                        'penalty': penalty},
                        scoring='roc_auc')

searcher.fit(train_x, train_y)
show3D(searcher, metrics, penalty, 'loss', 'penalty', 80)
searcher.cv_results_['mean_test_score']

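The plot shows the best score at huber + l2, but best_params_ reports a different combination. How is that possible? The plotting code looks correct and was taken from here: How to graph grid scores from GridSearchCV?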

【Comments】:

    Tags: python-3.x matplotlib machine-learning scikit-learn gridsearchcv


    【Answer 1】:

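    The reported best parameters are correct, since they come straight from searcher.best_params_. It is show3D that needs updating, because it assigns the CV results to the wrong parameter combinations. The flat mean_test_score array is ordered with 'loss' varying slowest and 'penalty' varying fastest, so it has to be reshaped to (len(grid_param_1), len(grid_param_2)) and then transposed: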

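    import numpy as np
    import matplotlib.pyplot as plt

    def show3D(searcher, grid_param_1, grid_param_2, name_param_1, name_param_2, rot=0):
        scores_mean = searcher.cv_results_['mean_test_score']
        # Candidates are listed with param 1 varying slowest and param 2 fastest:
        # reshape to (n_param_1, n_param_2), then transpose so each row is one value of param 2.
        scores_mean = np.array(scores_mean).reshape(len(grid_param_1), len(grid_param_2)).T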
    
        print('Best params = {}'.format(searcher.best_params_))
        print('Best score = {}'.format(scores_mean.max()))
    
        _, ax = plt.subplots(1,1)
    
        # Param1 is the X-axis, Param 2 is represented as a different curve (color line)
        for idx, val in enumerate(grid_param_2):
            ax.plot(grid_param_1, scores_mean[idx, :], '-o', label=name_param_2 + ': ' + str(val))
    
        ax.tick_params(axis='x', rotation=rot)
        ax.set_title('Grid Search Scores')
        ax.set_xlabel(name_param_1)
        ax.set_ylabel('CV score')
        ax.legend(loc='best')
        ax.grid(True)
    
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import GridSearchCV
    from sklearn.datasets import make_classification
    
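    # Keyword arguments: recent scikit-learn versions no longer accept n_informative positionally.
    train_x, train_y = make_classification(n_samples=10000, n_features=10, n_informative=2)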
    
    # Note: scikit-learn 1.1 renamed the 'log' loss to 'log_loss' (the old name was removed in 1.3).
    grid_param_1 = ['hinge', 'log', 'modified_huber', 'perceptron', 'huber', 'epsilon_insensitive']
    grid_param_2 = ['l2', 'l1', 'elasticnet']
    searcher = GridSearchCV(SGDClassifier(max_iter=10000), param_grid = {'loss': grid_param_1,
                                                                         'penalty': grid_param_2},
                            scoring='roc_auc')
    
    searcher.fit(train_x, train_y)
    searcher.best_params_
    
    show3D(searcher, grid_param_1, grid_param_2, 'loss', 'penalty', 80)
    searcher.cv_results_['mean_test_score']
    
    Output:
    
    Best params = {'loss': 'huber', 'penalty': 'elasticnet'}
    Best score = 0.9730321844671845
    array([0.97055738, 0.97121098, 0.97126158, 0.97163018, 0.97188638,
           0.97186598, 0.96557938, 0.97176798, 0.97196198, 0.95864618,
           0.96608918, 0.92235953, 0.96921638, 0.97070898, 0.97303218,
           0.96587218, 0.97211978, 0.96902218])
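
To see why the original reshape mislabels the scores, it helps to look at the candidate order in isolation. GridSearchCV builds its candidates with ParameterGrid, which sorts the parameter names and varies the last one fastest, so here 'penalty' cycles fastest. The sketch below mimics that ordering with itertools.product instead of calling scikit-learn, and it uses made-up stand-in scores (10 * loss index + penalty index) so each value reveals which combination it belongs to. The variable names are illustrative only.

```python
from itertools import product

import numpy as np

losses = ['hinge', 'log', 'modified_huber', 'perceptron', 'huber', 'epsilon_insensitive']
penalties = ['l2', 'l1', 'elasticnet']

# Assumption: candidates are ordered like product(losses, penalties),
# i.e. names sorted ('loss' < 'penalty') with the last name varying fastest.
candidates = list(product(losses, penalties))

# Stand-in scores (not real CV results) that encode their own (loss, penalty) position.
flat_scores = np.array(
    [10 * losses.index(loss) + penalties.index(pen) for loss, pen in candidates]
)

# Original reshape: treats the flat array as (n_penalty, n_loss), which it is not.
wrong = flat_scores.reshape(len(penalties), len(losses))

# Corrected reshape: (n_loss, n_penalty) first, then transpose so rows are penalties.
right = flat_scores.reshape(len(losses), len(penalties)).T

i, j = penalties.index('elasticnet'), losses.index('huber')
print(candidates[14])  # ('huber', 'elasticnet')
print(right[i, j])     # 42, the stand-in score of ('huber', 'elasticnet')
print(wrong[i, j])     # 51, which really belongs to ('epsilon_insensitive', 'l1')
```

Under this ordering, flat index 14 is ('huber', 'elasticnet'), which matches the position of the maximum 0.97303218 in the array printed above.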
    

    A somewhat ugly manual check that the parameters {'loss': 'huber', 'penalty': 'elasticnet'} really do produce the highest CV score:

    searcher.cv_results_['params'][np.argmax(searcher.cv_results_['mean_test_score'])]
    {'loss': 'huber', 'penalty': 'elasticnet'}
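
The same idea of pairing each score with its own entry in cv_results_['params'] can also be used to build the plotting matrix, which avoids relying on the candidate order at all. Below is a minimal sketch under that assumption. scores_by_params is a hypothetical helper, not part of scikit-learn, and fake_results is a made-up stand-in for searcher.cv_results_ with deliberately shuffled entries.

```python
import numpy as np


def scores_by_params(cv_results, name_1, values_1, name_2, values_2):
    """Return mean test scores as an array of shape (len(values_2), len(values_1)).

    Hypothetical helper: each score is placed using its own params dict,
    so the result does not depend on the order of the candidates.
    """
    grid = np.full((len(values_2), len(values_1)), np.nan)
    for params, score in zip(cv_results['params'], cv_results['mean_test_score']):
        row = values_2.index(params[name_2])
        col = values_1.index(params[name_1])
        grid[row, col] = score
    return grid


# Made-up stand-in for searcher.cv_results_, in shuffled order on purpose.
fake_results = {
    'params': [
        {'loss': 'huber', 'penalty': 'l1'},
        {'loss': 'hinge', 'penalty': 'l2'},
        {'loss': 'huber', 'penalty': 'l2'},
        {'loss': 'hinge', 'penalty': 'l1'},
    ],
    'mean_test_score': [0.4, 0.1, 0.3, 0.2],
}

# Rows follow ['l2', 'l1'], columns follow ['hinge', 'huber'].
grid = scores_by_params(fake_results, 'loss', ['hinge', 'huber'], 'penalty', ['l2', 'l1'])
print(grid)
```

Inside show3D, scores_mean could then be computed as scores_by_params(searcher.cv_results_, name_param_1, grid_param_1, name_param_2, grid_param_2), keeping the rest of the plotting loop unchanged.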
    

    【Comments】:
