Simple linear regression in TensorFlow produces a near-zero coefficient

【Question title】: Simple linear regression in Tensorflow produces near zero coefficient
【Posted】: 2021-11-27 21:04:57
【Question】:

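I am trying a simple linear regression in TensorFlow with a single independent variable. A plot of my data suggests the coefficient should be close to 1, and in fact, if I run it through sklearn.linear_model.LinearRegression, I get a sensible result of about 0.90.

However, running it in TensorFlow following this tutorial produces a coefficient very close to zero. I was able to get sensible results from TensorFlow using random numbers. I tried adjusting the learning rate and the number of epochs, but neither had any meaningful effect.

The MRE contains the actual data. The coefficient from sklearn comes out as 0.8975, but the one from TensorFlow comes out as 0.00045. I assume it is getting stuck in a local minimum, but none of the examples of that kind of problem I could find apply to mine.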

import numpy as np
import tensorflow as tf
from sklearn import linear_model

learning_rate = 0.1
epochs = 100

x_train = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348, 
                0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182, 
                -0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268, 
                -0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212, 
                -0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534, 
                0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158, 
                0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231, 
                0.00159, -0.00463, 0.00174, 0, -0.0029, 
                -0.00349, 0.01372, -0.00302])

y_train = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441, 
                0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416, 
                -0.00191, -0.00607, 0.00161, 0.00289, -0.00416, 
                0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032, 
                -0.00387, -0.00162, -0.00292, -0.01367, 0.00198, 
                0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164, 
                0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
                -0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196, 
                -0.00065, -0.00391, -0.0108, 0.01291, -0.00098])

regr = linear_model.LinearRegression()
regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
print ('Coefficients: ', regr.coef_)

weight = tf.Variable(0.)
bias = tf.Variable(0.)

for e in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = weight*x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    gradients = tape.gradient(loss, [weight, bias])
    weight.assign_sub(gradients[0]*learning_rate)
    bias.assign_sub(gradients[1]*learning_rate)

print(weight.numpy(), 'weight', bias.numpy(), 'bias')

【Comments】:

Tags: python tensorflow2.0

【Solution 1】:

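In the posted example, the x and y values of the training set are very small (on the order of 1e-3), which makes the gradients very small as well. The model is training correctly, but with these settings it may need a few million iterations to converge.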
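
To make the iteration count concrete, here is a rough NumPy sketch, not the poster's TensorFlow code. It uses synthetic data of a similar scale (the 0.005 spread and the true slope of 0.9 are assumptions for illustration) and the approximation that, once the bias has settled, each gradient-descent step shrinks the weight error by about 1 - 2 * learning_rate * var(x).

```python
import numpy as np

# Synthetic stand-in for the posted data (assumption): values on the
# order of 1e-3 and a true slope of 0.9.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.005, size=50)
y = 0.9 * x

learning_rate = 0.1


def gd_weight(x, y, steps, lr):
    """Plain gradient descent on mean squared error for y ~ w*x + b."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y
        grad_w = 2.0 * np.mean(err * x)
        grad_b = 2.0 * np.mean(err)
        w -= lr * grad_w
        b -= lr * grad_b
    return w


# Approximate per-step shrink factor of the weight error.
shrink = 1.0 - 2.0 * learning_rate * np.var(x)

# Weight reached after the 100 epochs used in the question.
w_100 = gd_weight(x, y, 100, learning_rate)

# Approximate number of steps for the weight error to fall to 1%.
steps_needed = int(np.log(0.01) / np.log(shrink))

print("shrink factor per step:", shrink)
print("weight after 100 steps:", w_100)
print("steps to reach 1% error:", steps_needed)
```

With inputs this small the shrink factor is extremely close to 1, so 100 steps move the weight only a tiny fraction of the way toward the true slope. That is consistent with the 0.00045 reported in the question.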

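scikit-learn's LinearRegression fits by ordinary least squares, solving for the coefficients directly instead of iterating, so it is not slowed down by the scale of the data.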
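
As a small illustration of a direct least-squares solve, the sketch below uses numpy.linalg.lstsq as a stand-in. It is not a copy of scikit-learn's internals, and the synthetic data (slope 0.9, intercept 0.001) is an assumption. The point is that the tiny scale of x does not matter when no iteration is involved.

```python
import numpy as np

# Synthetic data at the same tiny scale as the question (assumption).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.005, size=50)
y = 0.9 * x + 0.001

# Design matrix: one column for x, one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# Solve min ||A @ [slope, intercept] - y||^2 in a single call.
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

print("slope:", slope, "intercept:", intercept)
```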

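To bring training down to a manageable 1000 iterations, apply MinMaxScaler so that both x and y lie between 0 and 1. This gives larger gradients and lets the model train in reasonable time, but the results must be inverse-transformed back to the original scale after training, as in the modified code below.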

    import numpy as np
    import tensorflow as tf
    from sklearn import linear_model
    from sklearn.preprocessing import MinMaxScaler
    import matplotlib.pyplot as plt

    learning_rate = 0.1
    epochs = 1000  # enough once the data is scaled to [0, 1]
    
    x_train0 = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
                    0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
                    -0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
                    -0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
                    -0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
                    0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
                    0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
                    0.00159, -0.00463, 0.00174, 0, -0.0029,
                    -0.00349, 0.01372, -0.00302])
    scaler1 = MinMaxScaler()
    x_train = scaler1.fit_transform(x_train0.reshape(-1,1))
    y_train0 = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
                    0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
                    -0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
                    0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
                    -0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
                    0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
                    0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
                    -0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
                    -0.00065, -0.00391, -0.0108, 0.01291, -0.00098])
    scaler2 = MinMaxScaler()
    y_train = scaler2.fit_transform(y_train0.reshape(-1,1))
    
    # Reference fit on the scaled data (the scalers already return shape (n, 1))
    regr = linear_model.LinearRegression()
    regr.fit(x_train, y_train)
    print('Coefficients: ', regr.coef_, ' intercept ', regr.intercept_)
    
    weight = tf.Variable(0.)
    bias = tf.Variable(0.)
    
    for e in range(epochs):
        # Record only the forward pass on the tape
        with tf.GradientTape() as tape:
            y_pred = weight*x_train + bias
            loss = tf.reduce_mean(tf.square(y_pred - y_train))
        # Compute the gradients and take one gradient-descent step
        gradients = tape.gradient(loss, [weight, bias])
        weight.assign_sub(gradients[0]*learning_rate)
        bias.assign_sub(gradients[1]*learning_rate)
    
    print(weight.numpy(), 'weight', bias.numpy(), 'bias')
    
    # Plot the fitted line in the original units, using the final weight and bias
    y_fit = (weight*x_train + bias).numpy()
    plt.plot(x_train0, scaler2.inverse_transform(y_fit).flatten(), 'r', label='model output')
    plt.scatter(x_train0, y_train0, label='training dataset')
    plt.legend()
    plt.show()

Coefficients:  [[0.97913471]]  intercept  [-0.00420121]

0.96772194 weight 0.0018798028 bias
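
The coefficients printed above are for the min-max scaled data, so they are not directly comparable with the 0.8975 from the question. The sketch below shows one way to map a scaled slope and intercept back to the original units. The helper `unscale_coefficients` is hypothetical (it is not part of scikit-learn), and the check uses synthetic data with `np.polyfit` rather than the TensorFlow loop. With a fitted MinMaxScaler, the minimum and range should correspond to its `data_min_` and `data_range_` attributes.

```python
import numpy as np


def unscale_coefficients(w_scaled, b_scaled, x_min, x_range, y_min, y_range):
    """Convert a fit on min-max scaled data back to original units.

    Scaled model: (y - y_min) / y_range = w_scaled * (x - x_min) / x_range + b_scaled
    """
    slope = w_scaled * y_range / x_range
    intercept = y_min + y_range * b_scaled - slope * x_min
    return slope, intercept


# Sanity check on synthetic data (assumption: slope 0.9 plus small noise).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.005, size=50)
y = 0.9 * x + rng.normal(0.0, 0.001, size=50)

x_min, x_range = x.min(), x.max() - x.min()
y_min, y_range = y.min(), y.max() - y.min()
x_s = (x - x_min) / x_range
y_s = (y - y_min) / y_range

# Fit in scaled space, convert back, and compare with a direct fit.
w_s, b_s = np.polyfit(x_s, y_s, 1)
slope, intercept = unscale_coefficients(w_s, b_s, x_min, x_range, y_min, y_range)
slope_direct, intercept_direct = np.polyfit(x, y, 1)

# Same conversion using the scaled sklearn coefficients printed above and
# the minimum and maximum values of the posted x and y arrays.
slope_posted, intercept_posted = unscale_coefficients(
    0.97913471, -0.00420121,
    x_min=-0.01687, x_range=0.01372 - (-0.01687),
    y_min=-0.01513, y_range=0.01291 - (-0.01513),
)
print("slope in original units:", slope_posted)
```

Applied to the scaled sklearn coefficient above, this gives a slope of about 0.8975, in line with the value reported in the question.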

【Comments】: