【Question Title】: Why is TensorFlow estimator not able to make this simple linear regression prediction
【Posted】: 2018-07-28 16:09:04
【Question】:

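I'm currently learning TensorFlow and can't understand why it fails to make correct predictions for the following simple regression problem.

X is a random integer between 1000 and 8000, and Y is X + 250.

So if X is 2000, Y is 2250. This looks like a linear regression problem to me. However, when I try to predict, the result is far from what I expect: an X of 1000 gives a prediction of 1048 instead of 1250.

The loss and average loss are also very large: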

{'average_loss': 10269.81, 'loss': 82158.48, 'global_step': 1000}

The full code is below:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.model_selection import train_test_split

x_data = np.random.randint(1000, 8000, 1000000)
y_true = x_data + 250

feat_cols = [tf.feature_column.numeric_column('x', shape=[1])]
estimator = tf.estimator.LinearRegressor(feature_columns=feat_cols)

x_train, x_eval, y_train, y_eval = train_test_split(x_data, y_true, test_size=0.3, random_state=101)

input_func = tf.estimator.inputs.numpy_input_fn({'x': x_train}, y_train, batch_size=8, num_epochs=None, shuffle=True)
train_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_train}, y_train, batch_size=8, num_epochs=1000, shuffle=False)
eval_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_eval}, y_eval, batch_size=8, num_epochs=1000, shuffle=False)

estimator.train(input_fn=input_func, steps=1000)

train_metrics = estimator.evaluate(input_fn=train_input_func, steps=1000)
eval_metrics = estimator.evaluate(input_fn=eval_input_func, steps=1000)

print(train_metrics)
print(eval_metrics)

brand_new_data = np.array([1000, 2000, 7000])
input_fn_predict = tf.estimator.inputs.numpy_input_fn({'x': brand_new_data}, shuffle=False)

prediction_result = estimator.predict(input_fn=input_fn_predict)

print(list(prediction_result))

Am I doing something wrong, or am I misunderstanding what LinearRegression means?

【Comments】:

    Tags: python tensorflow machine-learning


    【Solution 1】:

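    I think you get the expected result once you adjust a few hyperparameters. I also changed the optimizer to AdamOptimizer.

    The main changes are a batch size of 1 and num_epochs set to None:

    train_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_train}, y_train, batch_size=1, num_epochs=None, shuffle=True)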
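
On why `num_epochs=None` matters: with `None`, `numpy_input_fn` keeps repeating the data, so the `steps` argument passed to `train` is the only thing that limits how many examples the model sees. The rough accounting below is a sketch in plain Python; `examples_seen` and `epochs_covered` are hypothetical helpers (not TensorFlow functions), and the training-set sizes assume the 70/30 split used in the code.

```python
def examples_seen(steps, batch_size):
    # Each training step consumes one batch.
    return steps * batch_size


def epochs_covered(steps, batch_size, n_train):
    # Fraction (or multiple) of the training set visited during training.
    return examples_seen(steps, batch_size) / n_train


# Question: 1,000,000 points, 70% used for training, batch_size=8, steps=1000.
question_epochs = epochs_covered(1000, 8, 700_000)

# Solution 1: 10,000 points, 70% used for training, batch_size=1, steps=1005555.
solution_epochs = epochs_covered(1_005_555, 1, 7_000)

print(f"question:   {question_epochs:.4f} epochs")
print(f"solution 1: {solution_epochs:.1f} epochs")
```

Under these assumptions the question's run sees only about 1% of its training data once, while the run in this answer passes over its smaller training set well over a hundred times.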

    Code:

    import tensorflow as tf
    import numpy as np
    from sklearn.model_selection import train_test_split
    
    x_data = np.random.randint(1000, 8000, 10000)
    y_true = x_data + 250
    
    
    feat_cols = tf.feature_column.numeric_column('x')
    optimizer = tf.train.AdamOptimizer(1e-3)
    
    estimator = tf.estimator.LinearRegressor(feature_columns=[feat_cols],optimizer=optimizer)
    
    x_train, x_eval, y_train, y_eval = train_test_split(x_data, y_true, test_size=0.3, random_state=101)
    
    
    train_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_train}, y_train, batch_size=1, num_epochs=None,
                                                          shuffle=True)
    
    eval_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_eval}, y_eval, batch_size=1, num_epochs=None,
                                                         shuffle=True)
    
    estimator.train(input_fn=train_input_func, steps=1005555)
    
    train_metrics = estimator.evaluate(input_fn=train_input_func, steps=10000)
    eval_metrics = estimator.evaluate(input_fn=eval_input_func, steps=10000)
    
    print(train_metrics)
    print(eval_metrics)
    
    brand_new_data = np.array([1000, 2000, 7000])
    input_fn_predict = tf.estimator.inputs.numpy_input_fn({'x': brand_new_data}, num_epochs=1,shuffle=False)
    
    prediction_result = estimator.predict(input_fn=input_fn_predict)
    
    for prediction in prediction_result:
        print(prediction['predictions'])
    

    Metrics:

    {'average_loss': 3.9024353e-06, 'loss': 3.9024353e-06, 'global_step': 1005555}

    {'average_loss': 3.9721594e-06, 'loss': 3.9721594e-06, 'global_step': 1005555}

    [1250.003]

    [2250.002]

    [7249.997]
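
A note on reading these metrics: `loss` and `average_loss` are identical here, which fits `batch_size=1`. The numbers in the question fit the reading that `loss` is the sum over a batch and `average_loss` is the per-example mean squared error. That reading is an assumption inferred from the reported values, not something stated in the thread. A quick arithmetic check:

```python
import math

# Values reported in the question (batch_size=8).
average_loss = 10269.81
loss = 82158.48
batch_size = 8

# If loss is the per-batch sum, it should equal average_loss * batch_size.
consistent = math.isclose(average_loss * batch_size, loss, rel_tol=1e-6)

# Assuming average_loss is a mean squared error, its square root gives a
# rough size for a typical prediction error, in the same units as Y.
rmse = math.sqrt(average_loss)

print("loss == average_loss * batch_size:", consistent)
print(f"root of average_loss: {rmse:.1f}")
```

Under the same reading, the losses of about 3.9e-06 above correspond to typical errors of a few thousandths, in line with the three predictions printed.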

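    【Comments】:

    • Nice. Would you mind explaining why the epoch size should be None and why that makes a difference?
    • I tuned the hyperparameters many times because this problem annoyed me too, and I observed that the loss gradually decreased as I increased the number of steps. A full explanation of the computation is in @dmainz's answer. Try reducing the steps and changing the optimizer to experiment. When I used other optimizers with higher batch_size and num_epochs values, the output was NaN; I tried clipping the gradients, but that didn't help. In the end, this is the code that worked.
    • If you found any of this useful, you can accept or upvote.
    【Solution 2】: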

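    The reason for this slow convergence (you need about 1 million training steps, which is odd for such a seemingly trivial problem) is that the data is not normalized.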
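
To illustrate the effect, here is a minimal NumPy sketch. It uses plain full-batch gradient descent, not the estimator's own optimizer, so it only shows the general mechanism; `fit_gd`, the learning rates, and the sample size are assumptions chosen for the illustration. With raw inputs the step size has to be tiny to stay stable, and the bias then barely moves in 1000 steps. With standardized inputs the same number of steps is more than enough.

```python
import numpy as np


def fit_gd(x, y, lr, steps):
    """Full-batch gradient descent on mean squared error for y ~ w * x + b."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y
        w -= lr * 2.0 * np.mean(err * x)
        b -= lr * 2.0 * np.mean(err)
    return w, b


rng = np.random.default_rng(0)
x = rng.integers(1000, 8000, size=10_000).astype(np.float64)
y = x + 250.0

# Raw inputs: the learning rate is scaled by mean(x**2) to keep updates stable.
lr_raw = 0.5 / np.mean(x ** 2)
w_raw, b_raw = fit_gd(x, y, lr_raw, steps=1000)
pred_raw = w_raw * 1000.0 + b_raw

# Standardized inputs and targets: an ordinary learning rate works.
x_n = (x - x.mean()) / x.std()
y_n = (y - y.mean()) / y.std()
w_n, b_n = fit_gd(x_n, y_n, lr=0.1, steps=1000)
pred_norm = (w_n * (1000.0 - x.mean()) / x.std() + b_n) * y.std() + y.mean()

print("raw inputs, prediction for 1000:         ", pred_raw)
print("standardized inputs, prediction for 1000:", pred_norm)
```

In the raw case the slope ends up slightly above 1 while the intercept stays near 0, so the prediction for 1000 falls well short of 1250, much like the value reported in the question.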

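    With normalization applied, I could train the model to predict the values accurately in 420 steps (I chose 420 because it's the meme number), obtaining these predictions: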

    [1250.387]
    [2250.2046]
    [7249.2915]
    

    Code (run in TF 2.2):

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import tensorflow as tf
    from sklearn.model_selection import train_test_split
    
    x_data = np.random.randint(1000, 8000, 1000000)
    y_true = x_data + 250
    
    tf.get_logger().setLevel('ERROR')
    
    feat_cols = [tf.feature_column.numeric_column('size', shape=[1])]
    estimator = tf.estimator.LinearRegressor(feature_columns=feat_cols, optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))
    
    def normalize(arr):
        return (arr - arr.mean()) / arr.std()
    
    x_data_norm = normalize(x_data)
    y_true_norm = normalize(y_true)
    
    x_train, x_eval, y_train, y_eval = train_test_split(x_data_norm, y_true_norm, test_size=0.3, random_state=101)
    
    # numpy_input_fn is for when you have the full dataset available in an array already and want a quick way to do batching/shuffling/repeating
    input_func = tf.compat.v1.estimator.inputs.numpy_input_fn({'size': x_train}, y_train, batch_size=1, num_epochs=None, shuffle=True)
    train_input_func = tf.compat.v1.estimator.inputs.numpy_input_fn({'size': x_train}, y_train, batch_size=1, num_epochs=None, shuffle=True)
    eval_input_func = tf.compat.v1.estimator.inputs.numpy_input_fn({'size': x_eval}, y_eval, batch_size=1, num_epochs=None, shuffle=True)
    
    estimator.train(input_fn=input_func, steps=420)
    
    train_metrics = estimator.evaluate(input_fn=train_input_func, steps=500)
    eval_metrics = estimator.evaluate(input_fn=eval_input_func, steps=500)
    
    print(train_metrics)
    print(eval_metrics)
    
    # brand_new_data = np.array([1000, 2000, 7000])
    # input_fn_predict = tf.compat.v1.estimator.inputs.numpy_input_fn({'size': brand_new_data}, num_epochs=1, shuffle=False)
    # prediction_result = estimator.predict(input_fn=input_fn_predict)
    # print(list(prediction_result))
    
    # predict w/ normalization
    d1 = 1000 
    d2 = 2000
    d3 = 7000
    brand_new_data = np.array([(d1 - x_data.mean()) / x_data.std(), (d2 - x_data.mean()) / x_data.std(), (d3 - x_data.mean()) / x_data.std()])
    
    input_fn_predict = tf.compat.v1.estimator.inputs.numpy_input_fn({'size': brand_new_data}, num_epochs=1, shuffle=False)
    prediction_result = estimator.predict(input_fn=input_fn_predict)
    
    for res in prediction_result:
      print(res['predictions'] * y_true.std() + y_true.mean())
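
One way to see why so few steps are enough: since Y is X plus a constant, both arrays have the same standard deviation, so the standardized targets equal the standardized inputs and the ideal model is simply weight 1, bias 0. The NumPy sketch below checks this and walks through the same scale and unscale round trip as the prediction code above. The seed and sample size are arbitrary choices, and the "ideal model" line stands in for a trained estimator.

```python
import numpy as np


def normalize(arr):
    return (arr - arr.mean()) / arr.std()


rng = np.random.default_rng(0)
x_data = rng.integers(1000, 8000, size=100_000)
y_true = x_data + 250

x_norm = normalize(x_data)
y_norm = normalize(y_true)

# Shifting by a constant changes the mean but not the spread,
# so the two standardized arrays coincide.
same = np.allclose(x_norm, y_norm)

# New inputs are scaled with the statistics of x_data ...
new_x = np.array([1000, 2000, 7000])
new_x_norm = (new_x - x_data.mean()) / x_data.std()

# ... passed through the ideal model (weight 1, bias 0) ...
ideal_pred_norm = 1.0 * new_x_norm + 0.0

# ... and mapped back with the statistics of y_true.
restored = ideal_pred_norm * y_true.std() + y_true.mean()

print("standardized X equals standardized Y:", same)
print("restored predictions:", restored)
```

The important detail is that new inputs are scaled with the mean and standard deviation of the training inputs, and outputs are unscaled with those of the targets.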
    

    【Comments】:
