Interpreting Keras Model Predict Output

【Question title】: Interpreting Keras Model Predict Output
【Posted】: 2021-07-10 13:01:44
【Question】:

I have fitted the MNIST digits Keras/TF example.

import numpy as np
import tensorflow as tf

digits_mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = digits_mnist.load_data()

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128,activation='relu'),
  tf.keras.layers.Dense(10)
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(0.001),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
)

model.fit(
    x=train_images,
    y=train_labels,
    epochs=6,
    validation_data=(test_images, test_labels),
)

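The sparse categorical accuracy reaches about 94.5%.

At this point I run one of the training examples through the model to see what the output looks like. I believe predict is the function to use for this. I had to reshape the training example a little (this may be where my problem lies; there are other posts about it, but none is conclusive).

I think the results are reasonable: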

image_in = train_images[0][ np.newaxis, :, : ] # reshape
predict = model.predict(image_in)
print(predict, train_labels[0])

image_in2 = train_images[1][ np.newaxis, :, : ] # reshape
predict = model.predict(image_in2)
print(predict, train_labels[1])

image_in3 = train_images[2][ np.newaxis, :, : ] # reshape
predict = model.predict(image_in3)
print(predict, train_labels[2])

image_in4 = train_images[3][ np.newaxis, :, : ] # reshape
predict = model.predict(image_in4)
print(predict, train_labels[3])

[[-15.103473 20.778965 -9.244939 62.400173 -23.793236
72.29711 -2.7528331 12.732147 37.075775 36.81269 ]] 5

[[ -1.3534731 -24.39009 -14.5208435 -20.452188 -16.758095 -12.028614 -13.0093565 -9.06416 -11.541512 -14.997495 ]] 0

[[-9.685611 18.384281 13.8173685 -0.23191524 37.27173
18.273088 -1.4883347 26.91457 11.042679 25.099646 ]] 4

[[ 11.550052 37.031742 -0.43448153 2.1549647 6.6804423 1.829277 11.534891 4.703198 1.562077 -14.293095 ]] 1

In each case the index of the largest output value matches the label.
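
Because the last Dense layer has no activation and the loss is built with from_logits=True, the values printed by predict are raw logits, not probabilities. A minimal sketch of how they might be read, using the first output above and plain NumPy (the helper name softmax is illustrative and not part of the code in this post):

```python
import numpy as np

# Logits copied from the first prediction printed above (label 5).
logits = np.array([[-15.103473, 20.778965, -9.244939, 62.400173, -23.793236,
                    72.29711, -2.7528331, 12.732147, 37.075775, 36.81269]])


def softmax(z):
    """Numerically stable softmax over the last axis."""
    shifted = z - z.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# The predicted class is the index of the largest logit.
predicted_class = int(np.argmax(logits, axis=-1)[0])

# Softmax turns the logits into values that sum to 1 per row.
probabilities = softmax(logits)

print(predicted_class)
print(probabilities.round(4))
```

Here the predicted class is 5, which agrees with the label. The same argmax step applies to every row when predict is called on a batch of images.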

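So I decided to run some tests on digits I drew myself.

MNIST digits appear to be white on a black background, so I apply a small transformation when loading the image: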

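import cv2
import numpy as np
from PIL import Image, ImageOps

image_file = Image.open('mysix.png')
image_file = ImageOps.grayscale(image_file)
mysix = np.invert(np.array(image_file))  # invert to white-on-black, like MNIST
image_in = mysix[np.newaxis, :, :]  # add the batch axis
predict = model.predict(image_in)
print(predict)
cv2.imwrite("real_test.png", mysix)  # save the inverted image for inspection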

The output is not as convincing.

This is for a six [incorrect]

[[-11.062315 -3.6117797 -12.970709 -3.692216 -20.52597
6.8898406 -6.7844076 -4.1480203 -8.589685 -8.556881 ]]

This is for a three [correct]

[[-30.695564 -23.397968 -21.212194 24.455023 -31.399946
10.118337 -82.92692 -10.150092 -5.8821173 -12.108372 ]]

This is for a seven [incorrect]

[[ 1.2403618 4.0243044 9.859227 9.83745 -6.681723 2.4680052
-7.4165597 6.6975245 3.355576 -9.518949 ]]

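1. Is the way I reshape the data to evaluate it with the trained model correct?
2. Is all the data processing I do in the code to load the grayscale PNG legitimate?
3. If 1 and 2 are both true, what would explain a model that validates at about 95% on the MNIST evaluation set at the end of my 6th training epoch but does so poorly on my (admittedly limited) evaluation set?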
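
For the first two questions, a few properties can be checked mechanically before calling predict: the array should have shape (28, 28), since that is what Flatten(input_shape=(28, 28)) expects, and the digit should be light on a dark background like MNIST. The following is a rough sketch under those assumptions. to_model_input is a hypothetical helper, the mean-brightness test for polarity is only a heuristic, and pixels stay in the 0 to 255 range because the model above was trained on unscaled images.

```python
import numpy as np


def to_model_input(gray):
    """Validate a grayscale digit and return a batch of shape (1, 28, 28).

    Hypothetical helper for illustration; not part of the original post.
    """
    gray = np.asarray(gray)
    if gray.shape != (28, 28):
        raise ValueError(f"expected a 28x28 grayscale image, got shape {gray.shape}")
    if gray.dtype != np.uint8:
        raise ValueError(f"expected uint8 pixels in 0..255, got {gray.dtype}")
    # Heuristic: a mostly bright image is dark ink on a light page, so invert
    # it to match MNIST's light digit on a dark background.
    if gray.mean() > 127:
        gray = 255 - gray
    return gray[np.newaxis, :, :]  # add the batch axis


# Usage with a synthetic drawing: a white page with a dark vertical stroke.
page = np.full((28, 28), 255, dtype=np.uint8)
page[4:24, 13:15] = 0
batch = to_model_input(page)
print(batch.shape, batch.min(), batch.max())
```

A drawing saved at another size would fail the shape check and would need to be resized first, for example with Pillow's Image.resize((28, 28)).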

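【Question comments】:

Simple: your images are too different from the MNIST dataset.
Isn't the point of a dataset to be an accurate subset/representation of "all" images? I thought that was the strength of this kind of algorithm: you get a good representation of all images, and the algorithm classifies based on patterns, so it can classify outside the training subset.
No, the MNIST dataset was never intended for that. It has only 60K images; it is an academic dataset.
Also, with a simple ANN you are just matching raw pixels, not learning high-level patterns.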
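
The last comment can be illustrated with a toy example. A dense layer on flattened pixels computes dot products between the image and fixed per-pixel weights, so a stroke that moves by a few pixels can stop matching entirely. The sketch below uses a hand-made template as the weights, which is an assumption for illustration and not what the trained network actually learned.

```python
import numpy as np

# A 28x28 image with a vertical stroke two pixels wide.
image = np.zeros((28, 28), dtype=np.float32)
image[4:24, 10:12] = 1.0

# Stand-in for one dense unit: weights equal to the stroke itself.
weights = image.flatten()

# The same stroke shifted three pixels to the right.
shifted = np.roll(image, shift=3, axis=1)

# Response of the unit to the original and to the shifted stroke.
score_original = float(weights @ image.flatten())
score_shifted = float(weights @ shifted.flatten())

print(score_original, score_shifted)
```

The shifted stroke no longer overlaps the template, so its score drops to zero even though a person would read it as the same shape. Hand-drawn digits that are positioned or scaled differently from MNIST can run into the same effect.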

Tags: python tensorflow keras neural-network data-science

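【Solution 1】:

I needed a convolutional neural network, similar to what is shown at around the 41-minute mark of this MIT video: https://www.youtube.com/watch?v=AjtX1N_VT9E (this addresses the issue Frightera mentioned above).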

model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(28,28,1), data_format="channels_last"),
  tf.keras.layers.MaxPooling2D(2,2),
  tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2,2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(1024,activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])

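I probably need to do more evaluation, but after a little over 6 epochs the network reached 99.63% sparse categorical accuracy, clearly better than the previous implementation, and it also classified my three hand-drawn digits correctly.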
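
Two details follow from the layer definitions above. The first Conv2D layer declares input_shape=(28, 28, 1), so the model expects a trailing channel axis, and adding it explicitly avoids relying on any implicit reshaping. The final layer already applies softmax, so the loss would be built with from_logits=False rather than True. The answer does not show its data preparation, so the sketch below is only an assumption about the shapes involved, using a zero-filled stand-in for the images:

```python
import numpy as np

# Stand-in for train_images as returned by mnist.load_data(): (N, 28, 28) uint8.
images = np.zeros((5, 28, 28), dtype=np.uint8)

# Add the channel axis expected by Conv2D with data_format="channels_last".
images_nhwc = images[..., np.newaxis]  # shape (5, 28, 28, 1)

# A single image needs both a batch axis and a channel axis before predict.
single = images[0][np.newaxis, :, :, np.newaxis]  # shape (1, 28, 28, 1)

print(images_nhwc.shape, single.shape)
```

With a softmax output layer, each row returned by predict sums to 1, and np.argmax over that row still gives the predicted digit.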

【Comments】: