How to pass this mock one-hot-encoded data through keras LSTM layer?

【Question title】: How to pass this mock one-hot-encoded data through keras LSTM layer?
【Posted】: 2020-02-03 13:39:02
【Question description】:

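As (I think) I understand it, the LSTM layer in Keras expects input data with 3 dimensions: (batch_size, timesteps, input_dim).

However, I'm really struggling to understand what these values correspond to for my data. I'm hoping that if someone can explain how I would feed the following mock data (which has a similar structure to my actual dataset) into an LSTM layer, I might then understand how to do the same with my real dataset.

The example data consists of sequences of categorical data encoded using one-hot-vector encoding. For example, the first 3 samples look like this: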

[ [0, 0, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0] ]

[ [0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0] ]

[ [0, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1] ]

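That is, each sequence has length 5, and there are 4 categorical options that can occupy each position in the sequence. Let's also say I have 3000 sequences. This is a binary classification problem.

So I believe this would make the shape of my dataset (3000, 5, 4)?

The model I want to use looks like this: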

model = keras.Sequential([
    keras.layers.LSTM(units=3, batch_input_shape=(???)),
    keras.layers.Dense(128, activation='tanh'),
    keras.layers.Dense(64, activation='tanh'),
    keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=20)

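This ignores any train/test split for now, so assume I'm training on the entire dataset. The part I'm struggling with is input_shape.

I want each element in a sequence to be one timestep. I have tried a lot of different shapes and got a lot of different errors. I'm guessing I actually need to reshape x_train rather than just adjust input_shape. The problem is that I don't know what shape it actually needs to be.

I think I understand the theory behind LSTMs; I'm just struggling with the practicalities of the dimension requirements.

Any help or advice would be greatly appreciated. Thank you.

EDIT - As suggested by @scign, here is an example of the error I get when using the following mock dataset code: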

x_train = [[0, 0, 0, 1], [0, 0, 1, 0], [
    1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0]], [[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]], [[0, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]

y_train = [1, 0, 1]

model = keras.Sequential([
    keras.layers.LSTM(units=3, batch_input_shape=(1, 5, 4)),
    keras.layers.Dense(128, activation='tanh'),
    keras.layers.Dense(64, activation='tanh'),
    keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=20)

Error - ValueError: Error when checking input: expected lstm_input to have 3 dimensions, but got array with shape (5, 4)

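【Comments】:

batch_input_shape should be (batch_size, timesteps, data_dim); see keras.io/getting-started/sequential-model-guide for some examples.
Yes, I know it's a 2d array @AKSW. My question is how to make it a 3d array suitable as input to an LSTM layer. But obviously not just any 3d array: it needs the correct size in each of the 3 dimensions, and that's what I'm struggling with.
Hmm, wrap it in another pair of [], I'd say?

Tags: python machine-learning keras lstm one-hot-encoding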

【Solution 1】:

As (I think) I understand it, the LSTM layer in Keras expects input data with 3 dimensions: (batch_size, timesteps, input_dim).

Correct.
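
To make the three dimensions concrete, here is a minimal NumPy sketch of how an LSTM consumes a (batch_size, timesteps, input_dim) array. It is not Keras's implementation: the helper name `lstm_forward`, the random weights and the gate ordering are assumptions chosen for illustration. The loop walks along the timesteps axis and ends with one vector of size `units` per sequence.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x, units, seed=0):
    """Run a randomly initialised LSTM over x of shape (batch, timesteps, input_dim).

    Hypothetical helper for illustration; returns the last hidden state,
    which has shape (batch, units).
    """
    if x.ndim != 3:
        raise ValueError(f"expected 3 dimensions, got array with shape {x.shape}")
    batch, timesteps, input_dim = x.shape
    rng = np.random.default_rng(seed)
    # Weights for the four gates, stacked side by side (order is an assumption).
    W = rng.normal(scale=0.1, size=(input_dim, 4 * units))  # input weights
    U = rng.normal(scale=0.1, size=(units, 4 * units))      # recurrent weights
    b = np.zeros(4 * units)

    h = np.zeros((batch, units))  # hidden state
    c = np.zeros((batch, units))  # cell state
    for t in range(timesteps):
        # x[:, t, :] is the one-hot vector at position t of every sequence.
        z = x[:, t, :] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

# Three sequences, five timesteps, four categories (category 0 everywhere).
x = np.zeros((3, 5, 4))
x[:, :, 0] = 1.0
out = lstm_forward(x, units=3)
print(out.shape)  # (3, 3)
```

With `units=3`, as in the question's model, each 5-step sequence is reduced to a 3-element vector, which is what the following Dense layers receive.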

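That is, each sequence has length 5, and there are 4 categorical options that can occupy each position in the sequence. Let's also say I have 3000 sequences. This is a binary classification problem.

So I believe this would make the shape of my dataset (3000, 5, 4)?

Correct.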
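
As a quick check of that shape, the following sketch builds a mock dataset of 3000 one-hot sequences with NumPy. The data is random and the variable names are illustrative. It also assumes every position holds exactly one category, whereas the question's samples contain a few all-zero rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sequences, timesteps, n_categories = 3000, 5, 4

# Integer category index for every position of every sequence.
labels = rng.integers(0, n_categories, size=(n_sequences, timesteps))

# Indexing the identity matrix with the labels one-hot encodes them.
x_train = np.eye(n_categories, dtype=np.float32)[labels]

# Random binary targets, one per sequence.
y_train = rng.integers(0, 2, size=n_sequences).astype(np.float32)

print(x_train.shape)  # (3000, 5, 4)
print(y_train.shape)  # (3000,)
```

The first axis indexes sequences, the second indexes positions within a sequence (timesteps), and the third holds the one-hot vector (input_dim).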

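Are you restricted to tensorflow 1.x? Version 2.x has been out for a while and keras is fully integrated with tf2, so unless you have some constraint you may want to consider using tf2.

EDIT: Looking at your training data:

You need an extra set of square brackets around your data
Your data needs to be in a single numpy array, not a list of lists
Your data elements must be floats rather than integers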
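
The three points above can be sketched with NumPy alone. The values are the ones from the question; the variable names `raw` and `single` are illustrative. Passing the whole collection to `np.array` has the same effect as adding the outer brackets, and `dtype=float` handles the conversion from integers.

```python
import numpy as np

# The question's data: three separate lists of integers, grouped only by commas.
raw = (
    [[0, 0, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0]],
    [[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]],
)

# One float array holding all samples: (samples, timesteps, features).
x_train = np.array(raw, dtype=float)
print(x_train.shape)     # (3, 5, 4)

# A single sample is only 2-D, the shape reported in the error message.
print(x_train[0].shape)  # (5, 4)

# To feed one sample on its own, add the batch axis back.
single = np.expand_dims(x_train[0], axis=0)
print(single.shape)      # (1, 5, 4)
```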

Additionally, you can use the input_dim parameter and specify just the number of features, rather than using batch_input_shape.

The following works for me.

# make tensorflow a bit quieter
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import tensorflow as tf
import numpy as np

x_train = np.array([
    [[0, 0, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0]],
    [[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
], dtype=float)
y_train = np.array([1, 0, 1], dtype=float)

features = x_train.shape[2]

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.LSTM(units=3, input_dim=features))
model.add(tf.keras.layers.Dense(128, activation='tanh'))
model.add(tf.keras.layers.Dense(64, activation='tanh'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=20)

print(model.predict(x_train))

Output

>>>python so60040420.py
Train on 3 samples
Epoch 1/20
WARNING:tensorflow:From C:\...\lib\site-packages\tensorflow_core\python\ops\nn_impl.py:183: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:Entity .initialize_variables at 0x0000019E75189598> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: module 'gast' has no attribute 'Num'
3/3 [==============================] - 1s 464ms/sample - loss: 0.7105 - accuracy: 0.0000e+00
Epoch 2/20
3/3 [==============================] - 0s 1ms/sample - loss: 0.6842 - accuracy: 0.6667
Epoch 3/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.6591 - accuracy: 0.6667
Epoch 4/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.6350 - accuracy: 0.6667
Epoch 5/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.6119 - accuracy: 0.6667
Epoch 6/20
3/3 [==============================] - 0s 1ms/sample - loss: 0.5897 - accuracy: 0.6667
Epoch 7/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.5684 - accuracy: 0.6667
Epoch 8/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.5479 - accuracy: 0.6667
Epoch 9/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.5282 - accuracy: 0.6667
Epoch 10/20
3/3 [==============================] - 0s 1ms/sample - loss: 0.5092 - accuracy: 0.6667
Epoch 11/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.4909 - accuracy: 0.6667
Epoch 12/20
3/3 [==============================] - 0s 1ms/sample - loss: 0.4733 - accuracy: 0.6667
Epoch 13/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.4564 - accuracy: 0.6667
Epoch 14/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.4402 - accuracy: 0.6667
Epoch 15/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.4246 - accuracy: 0.6667
Epoch 16/20
3/3 [==============================] - 0s 1ms/sample - loss: 0.4096 - accuracy: 0.6667
Epoch 17/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.3951 - accuracy: 0.6667
Epoch 18/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.3809 - accuracy: 0.6667
Epoch 19/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.3670 - accuracy: 1.0000
Epoch 20/20
3/3 [==============================] - 0s 2ms/sample - loss: 0.3531 - accuracy: 1.0000
[[0.8538592]
 [0.48295668]
 [0.8184752]]
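
For the question's original placeholder, another way to declare the shape is to describe one sample, (timesteps, features) = (5, 4), and leave the batch size unspecified. The sketch below does this with `tf.keras.Input`. It assumes a reasonably recent TensorFlow 2.x, differs from the code above (which uses `input_dim`), and drops the two hidden Dense layers for brevity.

```python
import numpy as np
import tensorflow as tf

timesteps, features = 5, 4

model = tf.keras.Sequential([
    # Shape of ONE sample; the batch dimension is omitted and stays flexible.
    tf.keras.Input(shape=(timesteps, features)),
    tf.keras.layers.LSTM(units=3),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Any number of sequences can be passed, as long as each one is (5, 4).
x = np.zeros((3, timesteps, features), dtype=np.float32)
x[:, :, 0] = 1.0
preds = model.predict(x, verbose=0)

print(model.output_shape)  # (None, 1)
print(preds.shape)         # (3, 1)
```

The `None` in the output shape is the unspecified batch dimension, so the same model accepts 3 sequences or 3000 without changes.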

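【Comments】:

Thank you. I've added an example of the error, as you suggested. Regarding the tf version, I'll look into it once I understand the solution to this problem, otherwise I might get a bit overwhelmed haha.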