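【Posted】: 2017-11-01 12:46:35
【Question】:
I am trying to classify income against the 50k threshold, and I am writing a cross-validation function to get the accuracy for each fold.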
X = df[['age','workclass','fnlwgt','education','marital_status','occupation','relationship','race','sex']]
y = df['income']
k_fold = 10
def k_fold_generator(X, y, k_fold):
    subset_size = len(X) / k_fold
    for k in range(k_fold):
        X_train = X[:k * subset_size] + X[(k + 1) * subset_size:]
        X_test = X[k * subset_size:][:subset_size]
        y_train = y[:k * subset_size] + y[(k + 1) * subset_size:]
        y_test = y[k * subset_size:][:subset_size]
        yield X_train, y_train, X_test, y_test
All of the above runs fine, but this loop fails:
for X_train, y_train, X_test, y_test in k_fold_generator(X, y, k_fold):
    print("Error")
TypeError: cannot do slice indexing on <class 'pandas.core.indexes.numeric.Int64Index'> with these indexers [0.0] of <class 'float'>
【Comments】:
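The traceback points at the slice bounds: in Python 3, `len(X) / k_fold` is true division and returns a float, so `k * subset_size` is `0.0` rather than `0`, and pandas refuses a float as a slice index. A second issue would show up once that is fixed: `+` between two DataFrames or Series adds values element-wise (aligned on the index) and does not concatenate rows. Below is a minimal sketch of one possible rewrite, assuming positional slicing is what was intended. `demo_X` and `demo_y` are made-up stand-ins for the asker's `df`, used only to show the fold sizes.

```python
import pandas as pd


def k_fold_generator(X, y, k_fold):
    # Floor division keeps the fold size an int, so it is a valid slice bound.
    subset_size = len(X) // k_fold
    for k in range(k_fold):
        start, stop = k * subset_size, (k + 1) * subset_size
        # pd.concat stacks the rows before and after the held-out fold;
        # `+` on pandas objects would add values element-wise instead.
        X_train = pd.concat([X.iloc[:start], X.iloc[stop:]])
        y_train = pd.concat([y.iloc[:start], y.iloc[stop:]])
        # .iloc makes the slicing explicitly positional.
        X_test = X.iloc[start:stop]
        y_test = y.iloc[start:stop]
        yield X_train, y_train, X_test, y_test


# Hypothetical demo data (25 rows), not the asker's dataset.
demo_X = pd.DataFrame({"age": range(25), "fnlwgt": range(100, 125)})
demo_y = pd.Series([i % 2 for i in range(25)], name="income")

fold_sizes = [
    (len(X_tr), len(X_te))
    for X_tr, _, X_te, _ in k_fold_generator(demo_X, demo_y, 10)
]
print(fold_sizes)
```

With this approach, any rows left over when `len(X)` is not a multiple of `k_fold` never land in a test fold, so they stay in every training set. If that matters, `sklearn.model_selection.KFold` may be a simpler option, since it handles uneven fold sizes and optional shuffling.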