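【Posted】:2017-03-27 00:08:15
【Question】:
As the alpha parameter approaches zero, the Tikhonov (ridge) cost becomes equal to the least-squares cost. Everything in the scikit-learn docs about the subject says the same. So I expected
sklearn.linear_model.Ridge(alpha=1e-100).fit(data, target)
to be equivalent to
sklearn.linear_model.LinearRegression().fit(data, target)
But that is not the case. Why?
Updated with code: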
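import pandas as pd
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.preprocessing import PolynomialFeatures
import matplotlib.pyplot as plt
# Jupyter/IPython magic; remove this line when running as a plain script
%matplotlib inline
dataset = pd.read_csv('house_price_data.csv')
# A pandas Series has no reshape(); go through the underlying NumPy array
X = dataset['sqft_living'].values.reshape(-1, 1)
Y = dataset['price'].values.reshape(-1, 1)
# Expand the single feature into powers x^0 ... x^15
polyX = PolynomialFeatures(degree=15).fit_transform(X)
model1 = LinearRegression().fit(polyX, Y)
model2 = Ridge(alpha=1e-100).fit(polyX, Y)
# Data as dots, LinearRegression fit in green, Ridge fit in red
plt.plot(X, Y,'.',
         X, model1.predict(polyX),'g-',
         X, model2.predict(polyX),'r-')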
Note: the plots look the same for alpha=1e-8 and alpha=1e-100.
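For anyone who wants to poke at this without the CSV, here is a minimal sketch on synthetic data. The data, the seed, and the variable names are assumptions made up for illustration; they are not the real house_price_data.csv. It checks two things: whether Ridge with a tiny alpha agrees with LinearRegression on a plain degree-1 fit, and how large the condition number of the degree-15 design matrix is. A very large condition number would be one possible reason the two estimators drift apart numerically, but the sketch does not prove that this is the cause.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the housing data (hypothetical values, not the real CSV).
rng = np.random.RandomState(0)
X = rng.uniform(500, 5000, size=(200, 1))            # living area in sq ft
y = 100.0 * X[:, 0] + rng.normal(0, 1e4, size=200)   # noisy linear "price"

# Degree 1: compare the two estimators on a simple, well-behaved problem.
ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1e-8).fit(X, y)
max_abs_diff = np.max(np.abs(ols.predict(X) - ridge.predict(X)))
print("max |OLS - Ridge| at degree 1:", max_abs_diff)

# Degree 15: raw powers of x span many orders of magnitude,
# so inspect the conditioning of the design matrix itself.
polyX = PolynomialFeatures(degree=15).fit_transform(X)
cond_poly = np.linalg.cond(polyX)
print("condition number at degree 15:", cond_poly)
```

If the degree-1 predictions agree while the degree-15 condition number is enormous, rescaling the feature before the polynomial expansion (for example with StandardScaler) is a cheap thing to try before comparing the two models again.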
【Discussion】:
Tags: python machine-learning scikit-learn regression linear-regression