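【Question title】: Plotting figures for k-nn with different values of k
【Posted】: 2021-01-03 03:32:34
【Question description】:

I want to plot one figure per value of k for a k-nn classifier. My problem is that all the figures look as if they were produced with the same k. What I have tried so far is to change the value of k on every iteration of the loop:

clf = KNeighborsClassifier(n_neighbors=counter+1) but all the figures appear to be for k=1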

from sklearn.datasets import fetch_california_housing
data = fetch_california_housing()
import numpy as np
from sklearn.model_selection import train_test_split

c = np.array([1 if y > np.median(data['target']) else 0 for y in data['target']])
X_train, X_test, c_train, c_test = train_test_split(data['data'], c, random_state=0)

from sklearn.neighbors import KNeighborsClassifier
import mglearn
import matplotlib.pyplot as plt

fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(20, 6))
for counter in range(3):      
    clf = KNeighborsClassifier(n_neighbors=counter+1) 
    clf.fit(X_test, c_test)
    plt.tight_layout()  # this will help create proper spacing between the plots.
    mglearn.discrete_scatter(X_test[:,0], X_test[:,1], c_test, ax=ax[counter])
    plt.legend(["Class 0", "Class 1"], loc=4)
    plt.xlabel("First feature")
    plt.ylabel("Second feature")
    #plt.figure()

【Comments】:

    Tags: python machine-learning knn


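    【Solution 1】:

    All the plots look the same because each iteration plots the test set itself (X_test with its true labels c_test) rather than the model's predictions on the test set. Nothing that is plotted depends on clf, so changing k cannot change the figure. You probably meant to do the following for each value of k:

    • Fit the model to the training set: replace clf.fit(X_test, c_test) with clf.fit(X_train, c_train).

    • Generate the model's predictions on the test set: add c_pred = clf.predict(X_test).

    • Plot the model's predictions on the test set: replace c_test with c_pred in the scatter plot, i.e. use mglearn.discrete_scatter(X_test[:, 0], X_test[:, 1], c_pred, ax=ax[counter]) instead of mglearn.discrete_scatter(X_test[:, 0], X_test[:, 1], c_test, ax=ax[counter]).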
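A minimal sketch of the point made by the three steps above: the true labels c_test are the same for every k, while the output of clf.predict(X_test) is recomputed for each model. To stay self-contained it uses synthetic data from make_classification instead of the housing data; that substitution, the sample sizes, and the `predictions` dictionary are assumptions of this sketch, not part of the original question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the housing data (an assumption made for this sketch).
X, c = make_classification(n_samples=400, n_features=4, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, c_train, c_test = train_test_split(X, c, random_state=0)

predictions = {}
for k in (1, 2, 3):
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(X_train, c_train)             # fit on the training set
    predictions[k] = clf.predict(X_test)  # predictions depend on k
    # c_test never changes, so plotting it gives the same figure for every k;
    # only the predicted labels can differ between iterations.
    n_diff = int(np.sum(predictions[k] != c_test))
    print(f"k={k}: {n_diff} of {len(c_test)} predicted labels differ from c_test")
```

Passing predictions[k] rather than c_test as the label argument of the scatter call is what makes each subplot reflect its own k.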

    Updated code:

    from sklearn.datasets import fetch_california_housing
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    import numpy as np
    import mglearn
    import matplotlib.pyplot as plt
    
    data = fetch_california_housing()
    
    c = np.array([1 if y > np.median(data['target']) else 0 for y in data['target']])
    
    X_train, X_test, c_train, c_test = train_test_split(data['data'], c, random_state=0)
    
    fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(20, 6))
    
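    for counter in range(3):
        k = counter + 1
        clf = KNeighborsClassifier(n_neighbors=k)
    
        # fit the model to the training set
        clf.fit(X_train, c_train)
    
        # extract the model predictions on the test set
        c_pred = clf.predict(X_test)
    
        # plot the model predictions on this iteration's own axes
        # (plt.legend / plt.xlabel / plt.ylabel would only affect the last subplot)
        mglearn.discrete_scatter(X_test[:,0], X_test[:,1], c_pred, ax=ax[counter])
        ax[counter].legend(["Class 0", "Class 1"], loc=4)
        ax[counter].set_xlabel("First feature")
        ax[counter].set_ylabel("Second feature")
        ax[counter].set_title(f"k = {k}")
    
    # adjust the spacing once, after all subplots have been drawn
    plt.tight_layout()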
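For readers who do not have mglearn installed, the same kind of figure can be sketched with plain matplotlib. This is an illustrative variant rather than the answer's own code: the synthetic two-feature data, the marker choices, and the names `ks` and `axes` are assumptions of this sketch.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-feature data (an assumption; swap in the housing split as needed).
X, c = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, c_train, c_test = train_test_split(X, c, random_state=0)

ks = (1, 2, 3)
fig, axes = plt.subplots(nrows=1, ncols=len(ks), figsize=(20, 6))

for ax, k in zip(axes, ks):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, c_train)
    c_pred = clf.predict(X_test)
    # One scatter call per predicted class so each class gets a legend entry.
    for label, marker in ((0, "o"), (1, "^")):
        mask = c_pred == label
        ax.scatter(X_test[mask, 0], X_test[mask, 1], marker=marker,
                   label=f"Class {label}")
    ax.set_title(f"k = {k}")
    ax.set_xlabel("First feature")
    ax.set_ylabel("Second feature")
    ax.legend(loc=4)

fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Setting the title, labels, and legend through each Axes object keeps every subplot annotated, which the pyplot-level calls do not guarantee inside a loop over subplots.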
    

    【Discussion】:
