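【Posted】: 2021-09-07 07:46:25
【Question】:
I'm working through a classification exercise with the iris dataset, and I've reached a point where I'm not sure what is happening. I think I'm passing the hypothetical dimensions of a new flower into the model and it's outputting a prediction for what the model believes the flower is, but I'm not sure.
I've posted all of my code, but the part I'm concerned about is here: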
species_id = clfr.predict([[1, 5, 4, 6]])
iris.target_names[species_id]
print(iris.target_names[species_id])
Here is all of my code:
# Importing required libraries
import numpy as np
import pandas as pd
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.datasets import load_iris
import sklearn.metrics as metrics
# Loading datasets
iris = load_iris()
# Convert to pandas dataframe
iris_data = pd.DataFrame({
'sepal length':iris.data[:,0],
'sepal width':iris.data[:,1],
'petal length':iris.data[:,2],
'petal width':iris.data[:,3],
'species':iris.target
})
iris_data.head()
# printing categories (setosa, versicolor, virginica)
print(iris.target_names)
# print flower features
print(iris.feature_names)
# setting independent (X) and dependent (Y) variables
X = iris_data[['sepal length', 'sepal width', 'petal length', 'petal width']] # Features
Y = iris_data['species'] # Labels
# printing feature data
print(X[0:5])
# printing dependent variable values (0 = setosa, 1 = versicolor, 2 = virginica)
print(Y)
# splitting into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.3, random_state = 100)
# defining random forest classifier
clfr = RandomForestClassifier(random_state = 100)
clfr.fit(X_train, y_train)
# making prediction
Y_pred = clfr.predict(X_test)
# checking model accuracy
print("Accuracy:", metrics.accuracy_score(y_test, Y_pred))
cm = np.array(confusion_matrix(y_test, Y_pred))
print(cm)
# making predictions on new data
species_id = clfr.predict([[1, 5, 4, 6]])
iris.target_names[species_id]
print(iris.target_names[species_id])
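【Discussion】:
- "I think I'm passing the hypothetical dimensions of a new flower into the model and it's outputting a prediction for what the model believes the flower is." You are right. You trained the model on the features ['sepal length', 'sepal width', 'petal length', 'petal width'], so [1, 5, 4, 6] are the corresponding values of a new, "unknown" flower whose species your model tries to predict.
- If you post that as an answer instead of a comment, I think I can mark it as correct.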
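To make the point in the comment above concrete, here is a minimal sketch of the same prediction step with each value tied explicitly to its feature name. The names `feature_cols` and `new_flower` are illustrative and not from the original post. Wrapping the sample in a DataFrame is only one way to do it; the plain nested list from the question is also accepted by `predict`, although newer scikit-learn versions may warn that it carries no feature names.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()

# Same column names and order as in the question (illustrative variable name)
feature_cols = ['sepal length', 'sepal width', 'petal length', 'petal width']
X = pd.DataFrame(iris.data, columns=feature_cols)
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=100
)
clfr = RandomForestClassifier(random_state=100)
clfr.fit(X_train, y_train)

# One hypothetical flower: the values line up with feature_cols, in that order
new_flower = pd.DataFrame([[1, 5, 4, 6]], columns=feature_cols)

# predict returns an array with one class index per input row
species_id = clfr.predict(new_flower)

# Indexing target_names with that array maps each index to its species name
species_name = iris.target_names[species_id]

# predict_proba returns one row per sample and one column per class
probabilities = clfr.predict_proba(new_flower)

print(species_id, species_name)
for name, p in zip(iris.target_names, probabilities[0]):
    print(f"{name}: {p:.2f}")
```

The probabilities show how confident the forest is in its answer. The model always returns one of the three species, even for measurements unlike any real iris, so the predicted label alone says little about whether the input was plausible.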
Tags: python scikit-learn classification