[Posted]: 2018-10-29 14:39:09
[Question]:
I am using the Titanic dataset to predict whether a passenger survived, using a random forest. Here is my code:
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn import cross_validation
import matplotlib.pyplot as plt
%matplotlib inline
data=pd.read_csv("C:\\Users\\kabala\\Downloads\\Titanic.csv")
data.isnull().any()
data["Age"]=data["Age"].fillna(data["Age"].median())
data["PClass"]=data["PClass"].fillna("3rd")
data["PClass"].isnull().any()
data.isnull().any()
pd.get_dummies(data.Sex)
# choosing the predictive variables
x=data[["PClass","Age","Sex"]]
# the target variable is y
y=data["Survived"]
modelrandom=RandomForestClassifier(max_depth=3)
modelrandom=cross_validation.cross_val_score(modelrandom,x,y,cv=5)
However, I keep getting this error:
ValueError: could not convert string to float: 'female'
I don't understand where the problem is, since I converted the Sex feature to dummy variables.
Thanks :)
[Discussion]:
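A likely cause, offered as a sketch rather than a definitive diagnosis: `pd.get_dummies(data.Sex)` returns a new DataFrame and does not modify `data`, so the `x` built afterwards still holds the raw strings in `Sex` (and in `PClass`). The example below uses a small made-up frame in place of `Titanic.csv` (the rows are hypothetical; only the column names come from the question) and keeps the encoded result before modelling.

```python
import pandas as pd

# Hypothetical stand-in for Titanic.csv: same column names as the question, invented rows.
data = pd.DataFrame({
    "PClass": ["1st", "3rd", None, "2nd"],
    "Age": [29.0, None, 40.0, 2.0],
    "Sex": ["female", "male", "male", "female"],
    "Survived": [1, 0, 0, 1],
})

# Same missing-value handling as in the question.
data["Age"] = data["Age"].fillna(data["Age"].median())
data["PClass"] = data["PClass"].fillna("3rd")

# get_dummies returns a new DataFrame, so assign the result instead of discarding it.
# Both string columns are encoded; Age is passed through unchanged.
x = pd.get_dummies(data[["PClass", "Age", "Sex"]], columns=["PClass", "Sex"], dtype=float)
y = data["Survived"]

# Every column should now be numeric, which is what the classifier expects.
print(x.dtypes)
```

With `x` encoded this way, the call to `cross_val_score(modelrandom, x, y, cv=5)` should no longer hit the string-to-float conversion. Note also that the `sklearn.cross_validation` module was removed in scikit-learn 0.20; on newer versions the same function is imported from `sklearn.model_selection`.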
Tags: python scikit-learn anaconda data-analysis cross-validation