【Question title】: How to extract the boundary values from k-nearest neighbors predict
【Posted】: 2021-01-31 13:17:12
【Question description】:

MRE

import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

# prepare data
iris = load_iris()
X = iris.data
y = iris.target
df = pd.DataFrame(X, columns=iris.feature_names)
df['label'] = y
species_map = dict(zip(range(3), iris.target_names))
df['species'] = df.label.map(species_map)
df = df.reindex(['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)', 'species', 'label'], axis=1)

# instantiate model
knn = KNeighborsClassifier(n_neighbors=6)

# fit on 'petal length (cm)' and 'petal width (cm)'
knn.fit(df.iloc[:, 2:4], df.label)

h = .02  # step size in the mesh

# create colormap for the contour plot
cmap_light = ListedColormap(list(sns.color_palette('pastel', n_colors=3)))

# Plot the decision boundary.
# For that, we will assign a color to each point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = df['petal length (cm)'].min() - 1, df['petal length (cm)'].max() + 1
y_min, y_max = df['petal width (cm)'].min() - 1, df['petal width (cm)'].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = knn.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# create plot
fig, ax = plt.subplots()

# add data points
sns.scatterplot(data=df, x='petal length (cm)', y='petal width (cm)', hue='species', ax=ax, edgecolor='k')

# add decision boundary contour map
ax.contourf(xx, yy, Z, cmap=cmap_light, alpha=0.4)

# legend
lgd = plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

plt.show()

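Resulting plot

Desired plot

  • Not the color or style, only the fact that it shows just the decision boundary and the data points.

Resources

SO Question that doesn't answer the question

Self-answer

  • I have provided a solution, but I am not sure it is the best one. I am certainly open to other options.
  • That said, I do not want a solution that relies on the colors in a contourf or pcolormesh plot.
  • In short, the optimal solution would extract only the decision boundary values.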

【Question discussion】:

    Tags: python numpy matplotlib scikit-learn knn


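    【Solution 1】:
    • This is one solution I came up with. It applies np.diff along both axes of Z, the .predict result. The idea is that wherever the predicted label changes between adjacent mesh cells, there is a decision boundary.
      • Use np.diff to subtract Z from itself, shifted by 1.
      • Create a mask with np.diff(Z) != 0.
      • Use the mask to select the matching x and y values from xx and yy.
    • Uses the existing code from the OP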
    # use diff to create a mask
    mask = np.diff(Z, axis=1) != 0
    mask2 = np.diff(Z, axis=0) != 0
    
    # apply mask against xx and yy
    xd = np.concatenate((xx[:, 1:][mask], xx[1:, :][mask2]))
    yd = np.concatenate((yy[:, 1:][mask], yy[1:, :][mask2]))
    
    # plot just the decision boundary
    fig, ax = plt.subplots()
    sns.scatterplot(x=xd, y=yd, color='k', edgecolor='k', s=5, ax=ax, label='decision boundary')
    plt.show()
    

    # plot the data points with the decision boundary overlaid
    fig, ax = plt.subplots()
    sns.scatterplot(data=df, x='petal length (cm)', y='petal width (cm)', hue='species', ax=ax, edgecolor='k')
    sns.scatterplot(x=xd, y=yd, color='k', edgecolor='k', s=5, ax=ax, label='decision boundary')
    lgd = plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.show()
    

    xd and yd correctly overlay plt.contourf
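
To make the np.diff idea easier to check by hand, here is a minimal sketch on a toy grid. The helper name boundary_points and the 3x4 array Z are illustrative assumptions, not part of the original answer; Z only stands in for the reshaped knn.predict output.

```python
import numpy as np


def boundary_points(xx, yy, Z):
    """Return mesh coordinates where the label differs from the previous cell.

    Hypothetical helper wrapping the np.diff masks from the answer above.
    """
    # True where the label changes between horizontally adjacent cells
    mask_x = np.diff(Z, axis=1) != 0
    # True where the label changes between vertically adjacent cells
    mask_y = np.diff(Z, axis=0) != 0
    # np.diff drops one column (or row), so index the mesh from 1 onward
    xd = np.concatenate((xx[:, 1:][mask_x], xx[1:, :][mask_y]))
    yd = np.concatenate((yy[:, 1:][mask_x], yy[1:, :][mask_y]))
    return xd, yd


# toy 3x4 grid standing in for the reshaped knn.predict output (assumed data)
xx, yy = np.meshgrid(np.arange(4), np.arange(3))
Z = np.array([[0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 2, 2, 1]])

xd, yd = boundary_points(xx, yy, Z)
print(list(zip(xd.tolist(), yd.tolist())))
```

A cell that differs from both its left and its upper neighbour is returned twice, once per mask. That is harmless for a scatter plot; if unique coordinates are needed, something like np.unique(np.column_stack((xd, yd)), axis=0) should remove the duplicates.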

    【Discussion】:
