【发布时间】:2016-05-17 00:37:57
【问题描述】:
这里是数据文件here 和here。您可以通过单击链接下载它。我正在使用 Pandas、Numpy 和 Python3。
这是我的代码:
import pandas as pa
import numpy as nu
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
def get_accuracy(X_train, y_train, X_test, y_test):
perceptron = Perceptron()
perceptron.fit(X_train, y_train)
perceptron.transform(X_train)
prediction = perceptron.predict(X_test)
result = accuracy_score(y_test, prediction)
return result
test_data = pa.read_csv("C:/Users/Roman/Downloads/perceptron-test.csv")
test_data.columns = ["class", "f1", "f2"]
train_data = pa.read_csv("C:/Users/Roman/Downloads/perceptron-train.csv")
train_data.columns = ["class", "f1", "f2"]
scaler = StandardScaler()
scaler.fit_transform(train_data[train_data.columns[1:]]).reshape(-1,1)
X_train = scaler.transform(train_data[train_data.columns[1:]])
scaler.fit_transform(train_data[train_data.columns[0]])
y_train = scaler.transform(train_data[train_data.columns[0]])
scaler.fit_transform(test_data[test_data.columns[1:]])
X_test = scaler.transform(test_data[test_data.columns[1:]])
scaler.fit_transform(test_data[test_data.columns[0]])
y_test = scaler.transform(test_data[test_data.columns[0]])
scaled_accuracy = get_accuracy(nu.ravel(X_train), nu.ravel(y_train), nu.ravel(X_test), nu.ravel(y_test))
print(scaled_accuracy)
这是我得到的错误:
Traceback (most recent call last):
File "C:/Users/Roman/PycharmProjects/data_project-1/lecture_2_perceptron.py", line 33, in <module>
scaled_accuracy = get_accuracy(nu.ravel(X_train), nu.ravel(y_train), nu.ravel(X_test), nu.ravel(y_test))
File "C:/Users/Roman/PycharmProjects/data_project-1/lecture_2_perceptron.py", line 9, in get_accuracy
perceptron.fit(X_train, y_train)
File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\linear_model\stochastic_gradient.py", line 545, in fit
sample_weight=sample_weight)
File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\linear_model\stochastic_gradient.py", line 389, in _fit
X, y = check_X_y(X, y, 'csr', dtype=np.float64, order="C")
File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\utils\validation.py", line 520, in check_X_y
check_consistent_length(X, y)
File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\utils\validation.py", line 176, in check_consistent_length
"%s" % str(uniques))
**ValueError: Found arrays with inconsistent numbers of samples: [ 1 299]**
如果不缩放数据,一切正常。但缩放后没有。
【问题讨论】:
-
您能分享一下您的 CSV 文件的内容吗?我的意思是,如果没有数据,就无法复制输出,你看!
-
调用
fit_transform返回缩放后的数据;尝试将您的fit_transforms设置为等于您的 X 和 y 训练/测试对象
标签: python numpy pandas machine-learning scikit-learn