【Question title】: TensorFlow feed_dict not learning
【Posted】: 2017-03-02 11:21:29
【Question】:

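Copying and pasting the code from TensorFlow's MNIST tutorial works fine and reaches 92% accuracy, as expected.

When I instead read the MNIST data from a CSV file and convert it to np arrays with pd.DataFrame.values, the process breaks down. I get ~10% accuracy from it (no better than random).

Here is the code (the tutorial code works fine, my CSV-reading version fails to learn):

Working MNIST tutorial: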

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)


x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})


correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

Not working (reading the CSV and feeding np arrays):

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split  # formerly sklearn.cross_validation

# read csv file
MNIST = pd.read_csv("/data.csv")

# pop label column and create training label array
train_label = MNIST.pop("label")

# converts from dataframe to np array
MNIST=MNIST.values

# convert train labels to one hots
train_labels = pd.get_dummies(train_label)
# make np array
train_labels = train_labels.values

x_train,x_test,y_train,y_test = train_test_split(MNIST,train_labels,test_size=0.2)
# we now have features (x_train) and y values, separated into test and train

# convert to dtype float 32
x_train,x_test,y_train,y_test = np.array(x_train,dtype='float32'), np.array(x_test,dtype='float32'),np.array(y_train,dtype='float32'),np.array(y_test,dtype='float32')



x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

def get_mini_batch(x,y):
    # choose 100 random row values
    rows=np.random.choice(x.shape[0], 100)
    # return arrays of 100 random rows (for features and labels)
    return x[rows], y[rows]

# train
for i in range(100):
    # get mini batch
    a,b=get_mini_batch(x_train,y_train)
    # run train step, feeding arrays of 100 rows each time
    sess.run(train_step, feed_dict={x: a, y_: b})

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: x_test, y_: y_test}))

Any help would be greatly appreciated. (CSV file here.)

【Comments】:

    Tags: csv pandas numpy neural-network tensorflow


    【Answer 1】:

    Have you tried training it for more iterations? I see that the original code trains for 1000 iterations

    for i in range(1000):
    

    while the CSV code trains for only 100 iterations:

    for i in range(100):
    

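    If that is not the cause, it would help if you could also share your CSV file, so that we can easily test your code.

    Edit:

    I have tested your code, and the problem seems to be caused by numerical instability in the naive cross_entropy computation (see this SO question). Replace your cross_entropy definition with the following lines and the problem should go away: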

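    # Keyword arguments are required from TensorFlow 1.0 onward.
    # Note: the op expects unscaled logits, while y here is already a softmax output.
    cross_entropy = tf.reduce_mean(tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(
        logits=y, labels=y_, name='xentropy')))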
    

    If you also print the returned cross_entropy, you will see that your code returns NaN, whereas with this code you get real numbers...
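
To make the NaN concrete, here is a small NumPy sketch. It is not from the original post: the function names `naive_xent` and `stable_xent` and the example logits are made up for illustration, and it assumes the logits have grown large. Once softmax rounds a probability to exactly 0, the hand-written loss multiplies 0 by log(0) and produces NaN, while a log-softmax computed directly from the logits stays finite.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum so that exp() cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def naive_xent(logits, labels):
    # Mirrors -sum(y_ * log(softmax(z))) from the question.
    # Breaks as soon as a probability underflows to exactly 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.mean(-np.sum(labels * np.log(softmax(logits)), axis=1))

def stable_xent(logits, labels):
    # Log-softmax via log-sum-exp: never takes the log of 0.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return np.mean(-np.sum(labels * log_probs, axis=1))

# Hypothetical logits, large enough for float32 exp() to underflow.
logits = np.array([[2000.0, 0.0, -2000.0]], dtype=np.float32)
labels = np.array([[0.0, 1.0, 0.0]], dtype=np.float32)

print("naive :", naive_xent(logits, labels))
print("stable:", stable_xent(logits, labels))
```

In TensorFlow terms, the stable variant corresponds to handing the pre-softmax value `tf.matmul(x, W) + b` to `tf.nn.softmax_cross_entropy_with_logits` as `logits`, since that op applies the softmax itself.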

    Full working code, which also prints cross_entropy at every iteration:

    import numpy as np
    import pandas as pd
    import tensorflow as tf
    from sklearn.model_selection import train_test_split  # formerly sklearn.cross_validation
    
    # read csv file
    MNIST = pd.read_csv("data.csv")
    
    # pop label column and create training label array
    train_label = MNIST.pop("label")
    
    # converts from dataframe to np array
    MNIST=MNIST.values
    
    # convert train labels to one hots
    train_labels = pd.get_dummies(train_label)
    # make np array
    train_labels = train_labels.values
    
    x_train,x_test,y_train,y_test = train_test_split(MNIST,train_labels,test_size=0.2)
    # we now have features (x_train) and y values, separated into test and train
    
    # convert to dtype float 32
    x_train,x_test,y_train,y_test = np.array(x_train,dtype='float32'), np.array(x_test,dtype='float32'),np.array(y_train,dtype='float32'),np.array(y_test,dtype='float32')
    
    x = tf.placeholder(tf.float32, [None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    y = tf.nn.softmax(tf.matmul(x, W) + b)
    y_ = tf.placeholder(tf.float32, [None, 10])
    print(y.get_shape())
    print(y_.get_shape())
    cross_entropy = tf.reduce_mean(tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_, name='xentropy')))
    train_step = tf.train.GradientDescentOptimizer(0.0001).minimize(cross_entropy)
    init = tf.initialize_all_variables()
    sess = tf.Session()
    sess.run(init)
    
    def get_mini_batch(x,y):
        # choose 100 random row values
        rows=np.random.choice(x.shape[0], 100)
        # return arrays of 100 random rows (for features and labels)
        return x[rows], y[rows]
    
    # train
    for i in range(1000):
        # get mini batch
        batch_x, batch_y = get_mini_batch(x_train, y_train)
        # run train step, feeding arrays of 100 rows each time
        _, cost = sess.run([train_step, cross_entropy], feed_dict={x: batch_x, y_: batch_y})
        print(cost)
    
    correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print(sess.run(accuracy, feed_dict={x: x_test, y_: y_test}))
    

    You will still need to tune the learning rate and the number of iterations, but with this setup you should already get about 70% accuracy.
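
One more thing that may be worth checking, offered as an assumption rather than something confirmed in this thread: the tutorial's `input_data.read_data_sets` loader returns pixel values scaled to [0, 1], whereas MNIST CSV exports often store raw 0 to 255 integers. If `data.csv` is of that kind, the inputs are up to 255 times larger than in the tutorial, which would push the logits toward the range where the loss turns into NaN. A minimal sketch of the scaling step (the helper name `scale_pixels` and the sample array are hypothetical):

```python
import numpy as np

def scale_pixels(images):
    """Return a float32 copy of `images` with values in [0, 1].

    Assumes raw data uses the 0..255 range; arrays whose maximum is
    already <= 1.0 are returned without rescaling.
    """
    images = np.asarray(images, dtype=np.float32)
    if images.max() > 1.0:
        images = images / 255.0
    return images

# Hypothetical stand-in for a few rows of the CSV pixel columns.
raw = np.array([[0, 128, 255], [255, 0, 64]])
scaled = scale_pixels(raw)
print(scaled.min(), scaled.max(), scaled.dtype)
```

Under that assumption, the call would go right after the conversion to float32, for example `x_train, x_test = scale_pixels(x_train), scale_pixels(x_test)`.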

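    【Comments】:

    • CSV link uploaded. And, alas, no... 1000 training iterations still leave my code at 10% accuracy.
    • When I run it, the accuracy drops even further. What accuracy do you get with this new cross_entropy?
    • Sorry, the minus sign in my code was wrong. I now get 73% accuracy, and I will put my full code in the answer body! Note that you can still tune the learning rate and the number of iterations to improve accuracy.
    【Answer 2】:

    I am fairly sure the batches should not be 100 random rows but 100 consecutive rows; for example, 0:99 and 100:199 would be your first two batches. Try this code for batching. See this kernel for training MNIST from CSV in TF.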

    epochs_completed = 0
    index_in_epoch = 0
    num_examples = train_images.shape[0]
    
    # serve data by batches
    def next_batch(batch_size):
    
        global train_images
        global train_labels
        global index_in_epoch
        global epochs_completed
    
        start = index_in_epoch
        index_in_epoch += batch_size
    
        # once all training data has been used, reshuffle it randomly
        if index_in_epoch > num_examples:
            # finished epoch
            epochs_completed += 1
            # shuffle the data
            perm = np.arange(num_examples)
            np.random.shuffle(perm)
            train_images = train_images[perm]
            train_labels = train_labels[perm]
            # start next epoch
            start = 0
            index_in_epoch = batch_size
            assert batch_size <= num_examples
        end = index_in_epoch
        return train_images[start:end], train_labels[start:end]
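
The same idea can be sketched without module-level globals. This is a variation on the code above, not the kernel's implementation: the generator name `epoch_batches` and the toy arrays are made up for illustration, it shuffles at the start of every epoch, and the last batch of an epoch may be smaller than `batch_size`.

```python
import numpy as np

def epoch_batches(images, labels, batch_size, rng=None):
    """Yield (images, labels) batches covering every row exactly once."""
    rng = np.random.default_rng() if rng is None else rng
    num_examples = images.shape[0]
    # One permutation per epoch keeps images and labels aligned.
    perm = rng.permutation(num_examples)
    for start in range(0, num_examples, batch_size):
        idx = perm[start:start + batch_size]
        yield images[idx], labels[idx]

# Toy data: 10 "images" whose single feature equals their label.
toy_images = np.arange(10).reshape(10, 1)
toy_labels = np.arange(10)
batches = list(epoch_batches(toy_images, toy_labels, 4, np.random.default_rng(0)))
print([len(batch_x) for batch_x, _ in batches])
```

To train for several epochs, call the generator once per epoch inside an outer loop and feed each yielded pair through `feed_dict`.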
    

    【Comments】:
