Simple MLP from scratch using TensorFlow

【Question title】: Simple MLP from scratch using TensorFlow
【Posted】: 2020-10-22 17:03:50
【Question description】:

I am trying to implement an MLP from scratch in TensorFlow and test it on the MNIST dataset. Here is my code:

import numpy as np
import tensorflow.compat.v1 as tf
from tensorflow.compat.v1.keras.losses import categorical_crossentropy
tf.disable_v2_behavior()

image_tensor = tf.placeholder(tf.float32 , shape=(None , 784))
label_tensor = tf.placeholder(tf.float32 , shape=(None , 10))

# Model architecture
# --> Layer 1
w1 = tf.Variable(tf.random_uniform([784 , 128])) # weights
b1 = tf.Variable(tf.zeros([128])) # bias
a1 = tf.matmul(image_tensor , w1) + b1
h1 = tf.nn.relu(a1)
# --> Layer 2
w2 = tf.Variable(tf.random_uniform([128 , 128]))
b2 = tf.zeros([128])
a2 = tf.matmul(h1 , w2) + b2
h2 = tf.nn.relu(a2)
# --> output layer
w3 = tf.Variable(tf.random_uniform([128 , 10]))
b3 = tf.zeros([10])
a3 = tf.matmul(h2 , w3) + b3
predicted_tensor = tf.nn.softmax(a3) 

loss = tf.reduce_mean(categorical_crossentropy(label_tensor , predicted_tensor))

opt = tf.train.GradientDescentOptimizer(0.01) 
training_step = opt.minimize(loss)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)   
    epochs = 50
    batch  = 100
    iterations = len(training_images) // batch

    for j in range(epochs):
        start = 0
        end = batch
        for i in range(iterations):
            image_batch = np.array(training_images[start : end])
            label_batch = np.array(training_labels[start : end])

            start = end  # move the window to the next batch
            end = start + batch
            _ , loss = sess.run(training_step  , feed_dict = {
                image_tensor : image_batch,
                label_tensor : label_batch
                })

But when I try to run this code, I get the following error message:

File "MNIST3.py", line 97, in <module>
    main()
  File "MNIST3.py", line 88, in main
    label_tensor : label_batch
TypeError: 'NoneType' object is not iterable

However, when I print the first 10 label samples:

print(training_labels[0 : 10])

this is the output:

[[1 0 0 0 0 0 0 0 0 0]
 [1 0 0 0 0 0 0 0 0 0]
 [1 0 0 0 0 0 0 0 0 0]
 [1 0 0 0 0 0 0 0 0 0]
 [1 0 0 0 0 0 0 0 0 0]
 [1 0 0 0 0 0 0 0 0 0]
 [1 0 0 0 0 0 0 0 0 0]
 [1 0 0 0 0 0 0 0 0 0]
 [1 0 0 0 0 0 0 0 0 0]
 [1 0 0 0 0 0 0 0 0 0]]

And when I print the shapes of the dataset:

print(training_images.shape)
print(training_labels.shape)

this is the output:

(10000, 784)
(10000, 10)

What am I missing here?

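【Comments on the question】:

Also, this is subjective, but consider exploring a TF 2.0 implementation as well. I used TF 1.0 for a long time and wrote many complex models with graphs and session.run, and so far TF 2.0 has made my life easier. It is also where all the support will eventually end up, so time spent there is the better investment. (Understanding graphs is still valuable, though, because you use them, for example, when building custom Keras layers.)

Tags: python tensorflow machine-learning mnist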

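【Solution 1】:

You are misreading the error message (Python can be misleading here, and we have all fallen for this kind of error more often than we would like to admit...). Even though the traceback points at the line label_tensor : label_batch, it is really about the whole session.run() call.

You see this error because you expect the call to return a tuple, but you only gave TensorFlow a single operation to evaluate, so there is only a single return value.

sess.run(training_step, feed_dict=...) returns None, because the operation training_step is not supposed to return anything; running it just performs one optimization step. Unpacking that None into _ , loss is what raises the TypeError.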
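The failure can be reproduced without TensorFlow at all. The following is a minimal sketch, assuming a hypothetical stand-in function `run_single_op` that returns None the way the call above does; it is not TensorFlow's API. The exact wording of the TypeError message depends on the Python version, so the sketch only checks the exception type.

```python
def run_single_op():
    """Hypothetical stand-in for sess.run(training_step, ...): returns None."""
    return None


def try_unpack():
    """Attempt the same two-name unpacking as the failing line in the question."""
    try:
        _, loss = run_single_op()  # None cannot be unpacked into two names
    except TypeError as exc:
        return type(exc).__name__
    return "no error"


result = try_unpack()
print(result)
```

Because the unpacking happens only after the whole call expression has been evaluated, the traceback reports the last line of the multi-line call rather than the assignment target.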

To get the result you want, change the code to:

_ , loss_result = sess.run([training_step, loss], 
                           feed_dict={
                               image_tensor : image_batch,
                               label_tensor : label_batch
                           })

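This way TensorFlow evaluates both operations: the first returns None (as you already saw), and the second returns the value of the loss function for the given batch.

(Note that you have to rename the loss variable on the left-hand side. If you don't, you overwrite the loss operation, and the next call may raise an exception or, worse, silently give wrong results.)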
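The renaming point is plain Python name rebinding, which the sketch below illustrates. `FakeOp` and `fake_run` are hypothetical stand-ins, not TensorFlow classes or functions, and the choice to make `fake_run` reject non-op fetches is an assumption made for illustration; how a real session reacts to a fetched number is not modeled here.

```python
class FakeOp:
    """Hypothetical placeholder for a graph node such as the `loss` tensor."""

    def __init__(self, name):
        self.name = name


def fake_run(fetches):
    """Hypothetical stand-in for sess.run: accepts only FakeOp fetches."""
    for fetch in fetches:
        if not isinstance(fetch, FakeOp):
            raise TypeError(f"cannot fetch {fetch!r}")
    # None for the training op, a made-up number for the loss.
    return [None, 0.5]


training_step = FakeOp("training_step")
loss = FakeOp("loss")

# Reusing the name `loss` on the left rebinds it to the fetched number.
_, loss = fake_run([training_step, loss])
loss_is_still_op = isinstance(loss, FakeOp)  # the graph node is no longer reachable via `loss`

# The next iteration now passes a float where a graph node is expected.
try:
    fake_run([training_step, loss])
    second_call = "ok"
except TypeError:
    second_call = "failed"

print(loss_is_still_op, second_call)
```

Using a separate name such as loss_result, as in the corrected call above, keeps `loss` bound to the graph node for every iteration.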
