【Question title】: Make a KNN predictive model with string values?
【Posted】: 2022-01-15 23:30:11
【Question description】:

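I want to build a predictive model that predicts whether an expedition succeeds (target = the success column in members), but my features are categorical rather than floats, which gives me an error. Is it possible to build the model I have in mind?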

import pandas as pd 

from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix

members = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/members.csv")
expeditions = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/expeditions.csv", parse_dates=['basecamp_date','highpoint_date','termination_date'])
peaks = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/peaks.csv")



members_sk = members
members_sk2 = pd.merge(members_sk, expeditions[["expedition_id", "nbre_total_membres"]], on = "expedition_id", how="inner")
members_sk3 = pd.merge(members_sk2, peaks[["peak_id", "height_metres"]], on = "peak_id", how="inner")

members_bis = members_sk3[["peak_name","season", "sex", 'age',"citizenship","expedition_role","hired","solo", "oxygen_used", "success", "nbre_total_membres", "height_metres"]]
members_bis = members_bis.dropna()

x = members_bis.drop(columns="success")
y = members_bis["success"]
xtrain, xtest, ytrain, ytest = train_test_split(x,y,test_size=0.35, random_state=1)


model_KNN = KNeighborsClassifier(n_neighbors = 10)
model_KNN.fit(xtrain, ytrain)
ypredict_KNN = model_KNN.predict(xtest)
print(ypredict_KNN, type(ypredict_KNN))

【Question discussion】:

    Tags: python pandas machine-learning scikit-learn


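    【Solution 1】:

    To train the model, you have to convert the categorical features to numeric ones. There are many ways to do this; two common approaches are applying OneHotEncoder or LabelEncoder.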
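As a rough illustration of the one-hot approach mentioned in the answer, the sketch below wraps `OneHotEncoder` and `KNeighborsClassifier` in a single scikit-learn pipeline. The `toy` DataFrame is a made-up stand-in for `members_bis` (so that the example runs without downloading anything), and the choice of `StandardScaler` for the numeric column is an assumption, not something the answer prescribes. Note that scikit-learn documents `LabelEncoder` as intended for the target `y`; for input features, `OneHotEncoder` or `OrdinalEncoder` is the usual choice.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy stand-in for members_bis; the values are invented for illustration only.
toy = pd.DataFrame({
    "season": ["Spring", "Autumn", "Spring", "Winter",
               "Autumn", "Spring", "Winter", "Autumn"],
    "sex": ["M", "F", "M", "M", "F", "F", "M", "F"],
    "age": [34, 28, 45, 39, 31, 52, 26, 41],
    "success": [True, False, True, False, False, True, False, True],
})
x = toy.drop(columns="success")
y = toy["success"]

categorical_cols = ["season", "sex"]  # string-valued features
numeric_cols = ["age"]                # already numeric

# One-hot encode the categorical columns and scale the numeric one,
# so that KNN distances are computed on purely numeric inputs.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ("num", StandardScaler(), numeric_cols),
])

model_KNN = make_pipeline(preprocess, KNeighborsClassifier(n_neighbors=3))
model_KNN.fit(x, y)
ypredict_KNN = model_KNN.predict(x)
print(ypredict_KNN, type(ypredict_KNN))
```

With the real data, the same pipeline could be fitted on `xtrain`/`ytrain` and evaluated on `xtest`; `handle_unknown="ignore"` keeps prediction from failing when the test split contains a category (for example a citizenship) that never appeared in training.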

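    【Discussion】:

    • Can I apply it to my whole dataframe, and will it know which columns to change?
    • You have to apply it to the categorical columns such as "peak_name", "season", "sex", "citizenship", etc.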
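To make the point in the comments concrete: an encoder transforms every column it is given, so the categorical columns have to be named explicitly. One possible way to avoid typing them by hand, sketched here under the assumption that every non-numeric column should be treated as categorical, is to let pandas pick them by dtype and pass the resulting list to a `ColumnTransformer`. The `toy` frame and its values are invented for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Toy frame mixing string and numeric columns (illustrative values only).
toy = pd.DataFrame({
    "peak_name": ["Everest", "Ama Dablam", "Everest", "Cho Oyu"],
    "season": ["Spring", "Autumn", "Spring", "Winter"],
    "age": [34, 28, 45, 39],
    "height_metres": [8850, 6814, 8850, 8188],
})

# Assumption: every non-numeric column is categorical.
categorical_cols = toy.select_dtypes(exclude="number").columns.tolist()
print(categorical_cols)

# Encode only the selected columns; numeric columns pass through unchanged.
encoder = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols)],
    remainder="passthrough",
)
encoded = encoder.fit_transform(toy)
print(encoded.shape)
```

Boolean columns such as `hired` or `solo` are also non-numeric under this selection, so they would be one-hot encoded as well, which is harmless for KNN but worth knowing.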