一元线性回归
一元线性回归的损失函数是
偏导数分别为
以一组简单的数据为例,具体代码和图像表示如下:
import numpy as np
import matplotlib.pyplot as plt
x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])
plt.scatter(x , y)
plt.show()
梯度下降
代码表示如下:
a = b = 0 #假设斜率和偏置项均为0
eta = 0.001
epsilon = 1e-8
times = 0
while True:
partial_a = 2 * sum(x * (a*x + b -y))
partial_b = 2 * sum((a * x + b -y))
if abs(partial_a) < epsilon and abs(partial_b) < epsilon:
break
a, b = a - eta * partial_a, b - eta * partial_b
times += 1
print('a is {}, b is {}, times is {}'.format(a, b, times))
结果为a is 0.8000000016366973, b is 0.399999994091004, times is 10136。
随机梯度下降
def loss(x, y, a, b):
return sum((a * x + b - y) ** 2)
eta = 0.01
a = b = 0
epsilon = 1e-5 #如果设为1e-8,很久也跑不出结果
times = 0
while True:
rand = np.random.randint(low=0, high = x.size)
partial_a = 2* x[rand] * (a*x[rand] + b - y[rand])
partial_b = 2 *(a * x[rand] + b -y[rand])
last_loss = loss(x, y, a, b)
print('loss is', last_loss)
a, b = a - eta * partial_a, b - eta * partial_b
cur_loss = loss(x, y, a, b)
if (abs(cur_loss - last_loss) < epsilon):
break
times += 1
结果如下所示:
使用keras,代码如下所示:
import tensorflow as tf
import numpy as np
from tensorflow import keras
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss='mean_squared_error')
xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0], dtype=float)
ys = np.array([1.0, 3.0, 2.0, 3.0, 5.0], dtype=float)
model.fit(xs, ys, epochs=1000)
print(model.predict([7.0]))
结果如下所示:
Epoch 1/1000
5/5 [==============================] - 1s 105ms/sample - loss: 44.1191
Epoch 2/1000
5/5 [==============================] - 0s 1ms/sample - loss: 25.9117
Epoch 3/1000
5/5 [==============================] - 0s 429us/sample - loss: 15.3013
Epoch 4/1000
5/5 [==============================] - 0s 883us/sample - loss: 9.1180
Epoch 5/1000
5/5 [==============================] - 0s 837us/sample - loss: 5.5147
Epoch 6/1000
5/5 [==============================] - 0s 815us/sample - loss: 3.4149
Epoch 7/1000
5/5 [==============================] - 0s 707us/sample - loss: 2.1912
Epoch 8/1000
5/5 [==============================] - 0s 794us/sample - loss: 1.4781
Epoch 9/1000
5/5 [==============================] - 0s 724us/sample - loss: 1.0626
Epoch 10/1000
5/5 [==============================] - 0s 844us/sample - loss: 0.8204
Epoch 11/1000
5/5 [==============================] - 0s 396us/sample - loss: 0.6793
Epoch 12/1000
5/5 [==============================] - 0s 281us/sample - loss: 0.5970
Epoch 13/1000
5/5 [==============================] - 0s 286us/sample - loss: 0.5491
Epoch 14/1000
5/5 [==============================] - 0s 315us/sample - loss: 0.5211
Epoch 15/1000
5/5 [==============================] - 0s 315us/sample - loss: 0.5048
Epoch 16/1000
5/5 [==============================] - 0s 303us/sample - loss: 0.4953
Epoch 17/1000
5/5 [==============================] - 0s 388us/sample - loss: 0.4898
Epoch 18/1000
5/5 [==============================] - 0s 336us/sample - loss: 0.4866
Epoch 19/1000
5/5 [==============================] - 0s 351us/sample - loss: 0.4847
Epoch 20/1000
5/5 [==============================] - 0s 283us/sample - loss: 0.4836
Epoch 21/1000
5/5 [==============================] - 0s 327us/sample - loss: 0.4829
Epoch 22/1000
5/5 [==============================] - 0s 333us/sample - loss: 0.4825
Epoch 23/1000
5/5 [==============================] - 0s 318us/sample - loss: 0.4823
Epoch 24/1000
5/5 [==============================] - 0s 307us/sample - loss: 0.4822
Epoch 25/1000
5/5 [==============================] - 0s 277us/sample - loss: 0.4821
Epoch 26/1000
5/5 [==============================] - 0s 365us/sample - loss: 0.4820
Epoch 27/1000
5/5 [==============================] - 0s 361us/sample - loss: 0.4820
Epoch 28/1000
5/5 [==============================] - 0s 296us/sample - loss: 0.4819
Epoch 29/1000
5/5 [==============================] - 0s 328us/sample - loss: 0.4819
Epoch 30/1000
5/5 [==============================] - 0s 387us/sample - loss: 0.4819
Epoch 31/1000
5/5 [==============================] - 0s 367us/sample - loss: 0.4819
Epoch 32/1000
5/5 [==============================] - 0s 288us/sample - loss: 0.4819
Epoch 33/1000
5/5 [==============================] - 0s 298us/sample - loss: 0.4819
Epoch 34/1000
5/5 [==============================] - 0s 299us/sample - loss: 0.4818
Epoch 35/1000
5/5 [==============================] - 0s 370us/sample - loss: 0.4818
loss基本上不变,原因在于Keras中的SGD指的并不是随机梯度下降,而是小批量随机梯度下降(https://stats.stackexchange.com/questions/221886/how-to-set-mini-batch-size-in-sgd-in-keras)。