【Posted】: 2020-07-22 14:45:05
【Question】:
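I'm following a tutorial on linear regression and machine learning in Python, and I decided to go one step further and check how many of the individual predictions are actually right or wrong. It turns out that a lot of my predictions are wrong (I round them, so a prediction with many decimal places still counts as correct as long as it rounds to the true value). Does anyone know why this happens? Thanks a lot!
Here is my code: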
import numpy as np
import pandas as pd
from sklearn import linear_model
from sklearn.model_selection import train_test_split

# Load the data and keep only the columns used in the tutorial
data = pd.read_csv('student-mat.csv', sep=';')
data = data[['G1', 'G2', 'G3', 'failures', 'absences', 'studytime', 'freetime', 'goout']]

predict = 'G3'  # column to predict
att = np.array(data.drop(columns=[predict]))  # features
lab = np.array(data[predict])                 # labels

att_train, att_test, lab_train, lab_test = train_test_split(att, lab, test_size=0.1)

linear = linear_model.LinearRegression()
linear.fit(att_train, lab_train)

# For a regressor, score() returns the R^2 coefficient of determination
acc = linear.score(att_test, lab_test)
print('Accuracy of the test: ' + str(acc) + '\n')

predictions = linear.predict(att_test)
print()

right_counter = 0
wrong_counter = 0
for b in range(len(predictions)):  # visit every test sample, including the last
    print(predictions[b], att_test[b], lab_test[b])
    if round(predictions[b]) == lab_test[b]:
        print("you're right")
        right_counter += 1
    else:
        print("you're wrong")
        wrong_counter += 1
print(f'Record: {right_counter} - {wrong_counter}')
【Discussion】:
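One thing that may explain the gap: for a regressor, `linear.score` returns R², not the share of predictions that match exactly. A model can have a high R² and still miss the exact integer grade most of the time after rounding. Below is a minimal sketch that puts the two views side by side. The helper `summarize_regression`, its `tolerance` parameter, and the example numbers are all made up for illustration; none of them come from the question or from `student-mat.csv`.

```python
def summarize_regression(predictions, labels, tolerance=1.0):
    """Compare continuous predictions against integer labels in several ways.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    n = len(labels)
    errors = [p - y for p, y in zip(predictions, labels)]

    # Share of predictions that equal the label exactly after rounding
    # (this is what the right/wrong counters in the question measure).
    exact = sum(round(p) == y for p, y in zip(predictions, labels)) / n

    # Share of predictions within `tolerance` points of the label.
    close = sum(abs(e) <= tolerance for e in errors) / n

    # Mean absolute error: the average size of a miss, in grade points.
    mae = sum(abs(e) for e in errors) / n

    # R^2 = 1 - SS_res / SS_tot, the quantity a regressor's score() reports.
    mean_y = sum(labels) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((y - mean_y) ** 2 for y in labels)
    r2 = 1 - ss_res / ss_tot

    return {"exact": exact, "close": close, "mae": mae, "r2": r2}


# Made-up numbers for illustration only, not real model output.
example_predictions = [9.6, 12.4, 14.7, 7.2, 15.8]
example_labels = [10, 11, 15, 8, 15]

summary = summarize_regression(example_predictions, example_labels)
print(summary)
```

With these made-up numbers, R² is about 0.91 while only 2 of 5 rounded predictions match exactly, and the average miss is under one grade point. So for a regression task, an error measure such as mean absolute error is probably a better fit than counting exact matches.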
Tags: python numpy machine-learning linear-regression