【Question title】: How to predict with polynomial regression in Python using nominal data
【Posted】: 2020-12-24 15:38:52
【Question description】:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

df = pd.read_csv("diamonds.csv")

df = pd.get_dummies(df, columns = ["color", "clarity", "cut"])

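# Drop the target and one reference level per categorical feature.
X = df.drop(labels=["price", "color_E", "clarity_VS2", "cut_Good"], axis=1).values
Y = df[["price"]].values

# The split was not shown in the original post; default arguments are assumed here.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y)
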
pf = PolynomialFeatures(degree=2, include_bias=False)
pf.fit(X_train)

X_train_transformed = pf.transform(X_train)
X_test_transformed = pf.transform(X_test)

modelR = LinearRegression()
modelR.fit(X_train_transformed, Y_train)


predictionlist = [0.23, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 61.5, 55, 3.47, 3.58, 1.57]

score = modelR.score(X_test_transformed, Y_test)
prediction = modelR.predict(pf.fit_transform([predictionlist]))[0][0]
print("Polynomial Regression score: " + str(score) + " prediction: " + str(prediction))

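This is the output:

Polynomial Regression score: 0.96599715147751 prediction: -16308769.231718607

The score of my polynomial regression is good, but the prediction is terrible. How can the price of a diamond be -16308769.231718607?

I suspect that my prediction list is badly ordered.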

【Question comments】:

    Tags: python machine-learning linear-regression sklearn-pandas polynomials


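    【Solution 1】:

    You have misused pf.transform. When you print the prediction you call fit_transform, which refits the transformer on a single instance, the one you want to predict. Call fit_transform only on the training set, then use transform alone for the test set and for the prediction list.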
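
The pattern described in this answer can be sketched as follows. This is a minimal illustration, not the asker's actual pipeline: the data is synthetic (three made-up numeric columns stand in for the diamonds features), and names such as `X_train_poly` and `new_row` are hypothetical. Note that, as far as I know, `PolynomialFeatures` only records the number of input columns when it is fitted, so this change alone may leave the predicted value unchanged; it is still the safer habit for transformers in general.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the real features (assumption: 3 numeric columns).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
# Noise-free target that is exactly a degree-2 polynomial of the inputs.
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] * X[:, 2]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

pf = PolynomialFeatures(degree=2, include_bias=False)
X_train_poly = pf.fit_transform(X_train)  # fit once, on the training data only
X_test_poly = pf.transform(X_test)        # reuse the fitted transformer

model = LinearRegression().fit(X_train_poly, y_train)
test_score = model.score(X_test_poly, y_test)

# A new instance goes through transform, never fit_transform.
new_row = np.array([[0.5, 0.2, 0.4]])
prediction = model.predict(pf.transform(new_row))[0]
```

Keeping every `fit` call on the training data means the test set and any later prediction rows are expanded with exactly the same feature layout the model was trained on.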

    【Comments】:

    • After your change my prediction is still the same.
    • Is the prediction list a single instance taken from your data?
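
One way to check the concern raised in these comments is to build the prediction row from named columns instead of typing the values by hand. `pd.get_dummies` keeps the non-encoded columns first and appends the dummy columns after them, so a hand-written list that places the dummies between the numeric values may not match the column order the model was trained on. The sketch below is only an illustration under assumptions: the tiny DataFrame is a made-up stand-in for diamonds.csv with fewer columns, and `feature_columns`, `new_diamond`, and `new_row` are hypothetical names.

```python
import pandas as pd

# Tiny stand-in for diamonds.csv (assumption: only these columns and values).
df = pd.DataFrame({
    "carat": [0.23, 0.21, 0.29, 0.31],
    "cut": ["Ideal", "Premium", "Good", "Ideal"],
    "color": ["E", "E", "I", "J"],
    "depth": [61.5, 59.8, 62.4, 63.3],
    "price": [326, 326, 334, 335],
})

# Encode the training features and remember their column order.
features = pd.get_dummies(
    df.drop(columns="price"), columns=["color", "cut"], dtype=float
)
feature_columns = features.columns

# Describe the new diamond in raw form, then encode it the same way.
new_diamond = pd.DataFrame(
    [{"carat": 0.23, "cut": "Ideal", "color": "E", "depth": 61.5}]
)
new_encoded = pd.get_dummies(new_diamond, columns=["color", "cut"], dtype=float)

# reindex aligns the row to the training column order;
# dummy columns that are absent for this diamond are filled with 0.
new_row = new_encoded.reindex(columns=feature_columns, fill_value=0.0)
```

If reference levels are dropped from the training features, as in the question, the same columns would need to be dropped before `feature_columns` is recorded. `new_row.values` can then be passed through `pf.transform` and on to the model.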