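TensorFlow Linear Regression gives 'NaN' result

【Question title】: TensorFlow Linear Regression gives 'NaN' result
【Posted】: 2017-12-16 07:40:39
【Question description】:

I am currently running a TensorFlow model using linear regression. However, I don't understand why, even after lowering learning_rate from 0.01 to 0.001 and increasing the number of training iterations from 1000 to 50000, I still get 'nan' for the cost function as well as for both coefficients. Can anyone help me find the problem in the code below?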

    from __future__ import print_function

    import tensorflow as tf
    import numpy
    import matplotlib.pyplot as plt
    import pandas as pd
    from sklearn.model_selection import train_test_split
    import random

    rng = numpy.random


    # Parameters
    learning_rate = 0.001 
    training_epochs = 20000 #number of iterations
    display_step = 400



    #read csv file
    datapath = [directory path]

    Ha_Noi = pd.read_csv(datapath+"HaNoi_1month_LW_WeatherTest.csv")
    #Add an additional column into the table
    sLength = len(Ha_Noi['accept_rate'])
    Ha_Noi['accept_rate_timeT'] = pd.Series(Ha_Noi['accept_rate'], index=Ha_Noi.index)
    #Shift the entries in the accept_rate column upward
    Ha_Noi.accept_rate = Ha_Noi.accept_rate.shift(-1)

    Ha_Noi = Ha_Noi.dropna(subset = ["longwait_percent4"])
    Ha_Noi = Ha_Noi.dropna(subset=["accept_rate"])
    Ha_Noi = Ha_Noi.dropna(subset = ["longwait_percent2"])
    df2 = pd.DataFrame(Ha_Noi)

    #split the dataset into training and testing sets
    train_set, test_set = train_test_split(Ha_Noi, test_size=0.2, random_state = random.randint(20, 200))
    # Series.reshape is deprecated in pandas; reshape the underlying NumPy array instead
    Xtrain = train_set['longwait_percent2'].values.reshape(-1,1)
    Ytrain = train_set['accept_rate'].values.reshape(-1,1)

    Xtrain2 = train_set['Weather Weight_Longwait_percent2'].values.reshape(-1,1)
    Xtest2 = test_set['Weather Weight_Longwait_percent2'].values.reshape(-1,1)

    # Xtest = test_set['longwait_percent2'].reshape(-1,1)
    # Ytest = test_set['accept_rate'].reshape(-1,1)

    # Training Data
    train_X = Xtrain
    train_Y = Ytrain
    n_samples = train_X.shape[0]

    #Testing Data
    # numpy is imported under its full name above, so `np` would be undefined here
    Xtest = numpy.asarray(test_set['longwait_percent2'])
    Ytest = numpy.asarray(test_set['accept_rate'])

    # tf Graph Input
    X = tf.placeholder("float")
    Y = tf.placeholder("float")

    # Set model weights
    W = tf.Variable(rng.randn(), name="weight")
    b = tf.Variable(rng.randn(), name="bias")

    # Construct a linear model
    pred = tf.add(tf.multiply(X, W), b)

    # Root mean squared error (square root of the mean squared error)
    cost = tf.sqrt(tf.reduce_sum(tf.pow(pred-Y, 2))/(n_samples))

    # Gradient descent method
    #  Note, minimize() knows to modify W and b because Variable objects are "trained" (trainable=True by default)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

    # Initializing the variables
    init = tf.global_variables_initializer()
    saver = tf.train.Saver() #save all the initialized data

    # Launch the graph
    with tf.Session() as sess:
        sess.run(init)

        # Fit all training data
        for epoch in range(training_epochs):
            for (x, y) in zip(train_X, train_Y):
                sess.run(optimizer, feed_dict={X: x, Y: y})

            # Display logs per epoch step
            if (epoch+1) % display_step == 0: # log every display_step epochs
                c = sess.run(cost, feed_dict={X: train_X, Y:train_Y})
                print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), \
                    "W=", sess.run(W), "b=", sess.run(b))

        print("Optimization Finished!")
        training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
        print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

        # Graphic display
        plt.plot(train_X, train_Y, 'ro', label='Original data')
        plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
        plt.legend()
        plt.show()

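        # Mean squared error on the test set (the training cost above is its square root)
        testing_mse = sess.run(
            tf.reduce_sum(tf.pow(pred - Y, 2)) / (Xtest.shape[0]),
            feed_dict={X: Xtest, Y: Ytest})
        # Take the root with NumPy so a number is printed rather than a Tensor object
        testing_cost = numpy.sqrt(testing_mse)
        print("Root Mean Square Error =", testing_cost)
        print("Absolute root mean square loss difference:", abs(
            training_cost - testing_cost))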

        plt.plot(Xtest, Ytest, 'bo', label='Testing data')
        plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
        plt.legend()
        plt.show()

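【Question discussion】:

Tags: tensorflow linear-regression

【Solution 1】:

Without your data, it is hard to tell whether the problem is caused by the data or by the training itself. You could make the learning rate and the number of training iterations much smaller, for example 0.00005 and 100, and see whether the NaN still appears.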
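
To make the two suspects concrete, here is a minimal NumPy-only sketch; it is not the asker's TensorFlow code. It first counts non-finite values in the inputs, then runs plain full-batch gradient descent on mean squared error with two learning rates. The data is synthetic and only assumes that the feature is on a percent-like 0 to 100 scale; `count_non_finite` and `fit_line` are hypothetical helper names. Whether this is what actually happens with the original CSV cannot be confirmed without the data.

```python
import numpy as np

def count_non_finite(values):
    """Return how many entries are NaN or +/-inf."""
    values = np.asarray(values, dtype=float)
    return int(np.sum(~np.isfinite(values)))

def fit_line(x, y, learning_rate, steps):
    """Full-batch gradient descent on mean squared error for y ~ w*x + b."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y
        w -= learning_rate * 2.0 * np.mean(err * x)
        b -= learning_rate * 2.0 * np.mean(err)
    return w, b

# Synthetic stand-in data (assumption): feature on a 0..100 scale
rng = np.random.RandomState(0)
x = rng.uniform(0.0, 100.0, size=200)
y = 0.5 * x + 3.0 + rng.normal(0.0, 1.0, size=200)

# Step 1: rule out bad values in the inputs themselves
print("non-finite inputs:", count_non_finite(x), count_non_finite(y))

# Step 2: same data, two learning rates
with np.errstate(over="ignore", invalid="ignore"):
    w_bad, b_bad = fit_line(x, y, learning_rate=0.001, steps=2000)
w_ok, b_ok = fit_line(x, y, learning_rate=0.00005, steps=2000)

print("learning_rate=0.001   ->", w_bad, b_bad)
print("learning_rate=0.00005 ->", w_ok, b_ok)
```

With unscaled features of this size, the larger step overshoots further on every update until the parameters overflow and turn into nan, while the smaller step stays finite. Running more iterations cannot repair a diverging run, which would be consistent with the NaN surviving the jump from 1000 to 50000 iterations.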

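【Discussion】:

Strangely, when I put the factor 2 back into the denominator of the cost function, I still got nan. Assuming there is nothing wrong with the data, could you help check whether something is wrong in the code above? I would also like to compute the RMSE; I think I just need to put tf.sqrt() around this formula: tf.reduce_sum(tf.pow(pred - Y, 2)) / (Xtest.shape[0]). Is that correct?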
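
On the RMSE part: yes, the root mean squared error is the square root of the mean squared error, so the square root goes around the whole averaged sum. A small NumPy sketch with made-up numbers (`mse` and `rmse` are illustrative helper names, not TensorFlow functions):

```python
import numpy as np

def mse(pred, target):
    """Mean of the squared differences."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

def rmse(pred, target):
    """Square root taken outside the mean."""
    return float(np.sqrt(mse(pred, target)))

pred = [1.0, 2.0, 3.0, 4.0]
target = [1.0, 2.0, 3.0, 8.0]

# Only the last pair differs (by 4), so MSE = 16 / 4 = 4.0 and RMSE = 2.0
print(mse(pred, target), rmse(pred, target))
```

In the TensorFlow 1.x graph from the question, the equivalent would be to wrap that expression in tf.sqrt() and evaluate it through sess.run(), or to evaluate the mean squared error first and apply numpy.sqrt to the resulting number. As for the factor 2: dividing the cost by a constant only rescales the gradient and leaves the minimizer unchanged, so it would not be expected to cure a NaN on its own.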
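I was able to run the code above and get a result. However, the RMSE in this case is much worse than that of the scikit-learn linear regression model. Does that mean my code is wrong? I find it hard to believe that a method as powerful as the one above (applying gradient descent) can give an RMSE twice that of scikit-learn's LinearRegression(). Can anyone help explain?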
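
One possible explanation, offered as a sketch rather than a diagnosis of this exact run: scikit-learn's LinearRegression solves the least-squares problem directly, whereas gradient descent only approaches that solution, and on an unscaled feature the intercept can converge very slowly. The example below uses synthetic data (an assumption) and compares a closed-form fit via `np.linalg.lstsq` with gradient descent on the raw feature and on a standardized copy. `rmse` and `gradient_descent` are hypothetical helpers.

```python
import numpy as np

def rmse(x, y, w, b):
    """Root mean squared error of the line w*x + b."""
    return float(np.sqrt(np.mean((w * x + b - y) ** 2)))

def gradient_descent(x, y, learning_rate, steps):
    """Full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y
        w -= learning_rate * 2.0 * np.mean(err * x)
        b -= learning_rate * 2.0 * np.mean(err)
    return w, b

# Synthetic stand-in data (assumption): raw feature on a 0..100 scale
rng = np.random.RandomState(1)
x = rng.uniform(0.0, 100.0, size=500)
y = 0.3 * x + 20.0 + rng.normal(0.0, 2.0, size=500)

# Closed-form least squares, the kind of direct solve scikit-learn performs
A = np.column_stack([x, np.ones_like(x)])
(w_ls, b_ls), *_ = np.linalg.lstsq(A, y, rcond=None)

# Gradient descent on the raw feature: stable, but the intercept barely moves
w_raw, b_raw = gradient_descent(x, y, learning_rate=0.0001, steps=1000)

# Gradient descent on a standardized feature, mapped back to the raw scale
mu, sigma = x.mean(), x.std()
z = (x - mu) / sigma
w_z, b_z = gradient_descent(z, y, learning_rate=0.1, steps=1000)
w_std, b_std = w_z / sigma, b_z - w_z * mu / sigma

rmse_ls = rmse(x, y, w_ls, b_ls)
rmse_raw = rmse(x, y, w_raw, b_raw)
rmse_std = rmse(x, y, w_std, b_std)
print("least squares       :", rmse_ls)
print("GD, raw feature     :", rmse_raw)
print("GD, standardized x  :", rmse_std)
```

In this setup the raw-feature run ends with a clearly worse RMSE than the closed-form fit, while the standardized run matches it closely. Note also that the training loop in the question feeds one sample at a time into a square-root cost; for a single sample that cost equals |pred - y| / sqrt(n_samples), so each update follows an absolute-error gradient rather than the least-squares gradient, which may be a further reason the two models disagree.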