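[Posted]: 2020-05-24 18:34:59
[Question]:
I wrote some Python code to fit a KNN model to the well-known iris dataset. I tried different values of k, such as k=2, k=3, and k=5. As I understand it, the confusion matrix, accuracy score, and classification report should differ across these values of k. However, no matter which k I use, the metrics come out identical, and precision, recall, and f1-score are all 1.00, as shown in the snapshot (codes and output). Am I missing something here? Thanks!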
from sklearn.model_selection import train_test_split
# first split the dataset into its attributes and labels
X = data.iloc[:, :-1].values
y = data.iloc[:, 4].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)
from sklearn.neighbors import KNeighborsClassifier
# Instantiate learning model (k = 5)
clf = KNeighborsClassifier(n_neighbors=5)
# Fitting the model
clf.fit(X_train, y_train)
# Predicting the Test set results
y_pred = clf.predict(X_test)
print(y_pred)
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))
print("classification report:---------------------------\n")
# `iris` is not defined in this snippet, so let the report infer the labels from y_test and y_pred
print(classification_report(y_test, y_pred))
[Discussion]:
- Where are you loading the data from? An external CSV, or sklearn's built-in iris dataset?
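If the data does come from sklearn's built-in iris dataset, the experiment in the question could be reproduced with something like the sketch below. It is only an illustration under that assumption: `load_iris`, the loop over k, the `results` and `cv_means` dictionaries, and the 5-fold cross-validation step are not in the original post. Iris is a small, well-separated dataset, so it is plausible for several values of k to give the same predictions on a single 30% test split; cross-validation averages over several splits and may show differences that one split hides.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the data is sklearn's built-in iris dataset, not an external CSV.
iris = load_iris()
X, y = iris.data, iris.target

# Same split settings as in the question.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

results = {}   # accuracy on the single held-out split, per k
cv_means = {}  # mean 5-fold cross-validated accuracy, per k

for k in (2, 3, 5):
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    results[k] = accuracy_score(y_test, y_pred)
    # cross_val_score clones the estimator and refits it on each fold.
    cv_means[k] = cross_val_score(
        KNeighborsClassifier(n_neighbors=k), X, y, cv=5
    ).mean()

    print(f"k={k}: hold-out accuracy={results[k]:.3f}, "
          f"5-fold mean accuracy={cv_means[k]:.3f}")
    print(confusion_matrix(y_test, y_pred))
```

Comparing the hold-out numbers with the cross-validated ones is one way to check whether identical scores reflect the model or just this particular split.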
Tags: python machine-learning statistics data-science