How to use weights extracted from an LSTM recurrent neural network

【Question title】: How to use weights extracted from an LSTM recurrent neural network
【Posted】: 2018-05-12 05:57:33
【Question】:

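I have trained an LSTM recurrent neural network for sequence (time series) classification using Keras in Python.

The features are arranged into an array of shape (batch_size, timesteps, data_dim). I have 1000 training samples in total. The final goal is to classify each sequence into one of 5 classes. Here is a snippet of my code.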

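# imports needed by this snippet (standalone Keras API, as used at the time of the question)
import numpy
from keras import optimizers
from keras.models import Sequential
from keras.layers import LSTM, Dense

# defining some model features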
data_dim = 15
timesteps = 20
num_classes = len(one_hot_train_labels[1,:])
batch_size = len(ytrain) 

#reshaping array for LSTM training
xtrain=numpy.reshape(xtrain, (len(ytrain), timesteps, data_dim))
xtest=numpy.reshape(xtest, (len(ytest), timesteps, data_dim))

rms = optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0) #It is recommended to leave the parameters
#of this optimizer at their default values (except the learning rate, which can be freely tuned).

# create the model
model = Sequential()
model.add(LSTM(100, dropout=0.5, recurrent_dropout=0.5, input_shape=(timesteps, data_dim), activation='tanh'))
model.add(Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=rms, metrics=['accuracy'])
print(model.summary())
history = model.fit(xtrain, one_hot_train_labels, epochs=200, batch_size=10)
# Final evaluation of the model on the training set and on the test set
scores = model.evaluate(xtrain, one_hot_train_labels, verbose=0)
print("Train accuracy: %.2f%%" % (scores[1]*100))
scores = model.evaluate(xtest, one_hot_test_labels, verbose=0)
print("Test accuracy: %.2f%%" % (scores[1]*100))

Since I want to use and implement the classifier somewhere else, I extracted the weights with:

weights = model.get_weights()  # flat list of all weight arrays, in layer order

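Having used traditional neural networks and logistic regression in the past, I was expecting 2 matrices for every layer, one with the polynomial weights and one with the bias units, and then to use the activation functions (in this case the tanh and softmax functions) to progressively find the probability of belonging to one of the 5 classes.

But now I am confused, because calling the weights returns 5 matrices with the following sizes: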

(15, 400)
(100, 400)
(400,)
(100,5)
(5,)

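Now, I understand that an LSTM works with 4 different blocks:

the input from the vector
the memory from the previous block
the memory of the current block
the output of the previous block

and that this is why the second dimension of my matrices has size 400.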
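One way to check this reading of the shapes is to recompute them from the layer sizes. The sketch below assumes 100 LSTM units (implied by the (100, 400) matrix) and assumes the Keras layout in which the four gate blocks are concatenated along the last axis. The variable names are illustrative and do not come from the original code.

```python
from math import prod

# Layer sizes taken from the question; `units` is inferred from the
# (100, 400) recurrent matrix.
data_dim, units, num_classes = 15, 100, 5
n_gates = 4  # input gate, forget gate, cell candidate, output gate

# Expected shapes for LSTM(units) followed by Dense(num_classes),
# assuming the gate blocks are concatenated along the last axis.
expected_shapes = [
    (data_dim, n_gates * units),  # LSTM kernel, applied to the input x_t
    (units, n_gates * units),     # LSTM recurrent kernel, applied to h_{t-1}
    (n_gates * units,),           # LSTM bias
    (units, num_classes),         # Dense kernel
    (num_classes,),               # Dense bias
]

param_counts = [prod(shape) for shape in expected_shapes]
for shape, count in zip(expected_shapes, param_counts):
    print(shape, count)
print("total parameters:", sum(param_counts))
```

If the reading is right, the total should match the parameter count reported by model.summary().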

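Now my questions are:

How can I use these weights in a cascade fashion, with the activation functions (as in a traditional neural network), to finally obtain the class probabilities?

Where is the bias unit for the input layer?

Thanks to anyone who can help clarify how to use a tool as powerful as the LSTM network.

I hope this helps others as well.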

【Question comments】:

Up! Can anyone help?

Tags: python tensorflow machine-learning neural-network keras

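【Solution 1】:

When you say you want class probabilities, I assume you want class predictions (?). After training the network, you can call model.predict() to get the class probabilities. If you want to predict later or elsewhere, it is best to call model.save_weights(filename) and then model.load_weights(filename) on a model with the same architecture. The input layer has no bias; you can check how many parameters each layer has with model.summary().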
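If the goal is to reproduce the prediction outside Keras, the five arrays can be applied by hand. The following is a minimal NumPy sketch, not the library's own implementation. It assumes the Keras gate order (input, forget, cell candidate, output) inside the 400-wide arrays, a zero initial state, and no dropout at inference time. The recurrent activation is left as a parameter because its default depends on the Keras version: older standalone Keras used a hard sigmoid, while newer versions use a plain sigmoid. In this layout the (400,) array is the bias of the LSTM layer itself. The function names and the random demo weights are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hard_sigmoid(x):
    # Piecewise-linear sigmoid approximation; assumed to match the default
    # recurrent activation of older Keras releases (check your version).
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def lstm_dense_predict(x, weights, recurrent_activation=hard_sigmoid):
    """x: (batch, timesteps, data_dim); weights: the 5 arrays from get_weights()."""
    W, U, b, W_out, b_out = weights
    units = U.shape[0]
    h = np.zeros((x.shape[0], units))  # hidden state (layer output)
    c = np.zeros((x.shape[0], units))  # cell state (memory)
    for t in range(x.shape[1]):
        z = x[:, t, :] @ W + h @ U + b  # (batch, 4 * units)
        i = recurrent_activation(z[:, :units])               # input gate
        f = recurrent_activation(z[:, units:2 * units])      # forget gate
        g = np.tanh(z[:, 2 * units:3 * units])               # cell candidate
        o = recurrent_activation(z[:, 3 * units:])           # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
    # Dense softmax layer applied to the last hidden state only
    return softmax(h @ W_out + b_out)

# Demo with random weights that have the shapes listed in the question.
rng = np.random.default_rng(0)
shapes = [(15, 400), (100, 400), (400,), (100, 5), (5,)]
weights = [rng.normal(scale=0.1, size=s) for s in shapes]
x = rng.normal(size=(3, 20, 15))
probs = lstm_dense_predict(x, weights)
print(probs.shape)
print(probs.sum(axis=1))
```

With a trained model, passing model.get_weights() as `weights` and comparing the result against model.predict(xtest) with np.allclose is a reasonable way to confirm that the gate order and activation assumptions match the installed Keras version.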

【Comments】: