Keras LSTM for timeseries prediction: predicting vectors of features

【Question title】: Keras LSTM for timeseries prediction: predicting vectors of features
【Posted】: 2017-10-08 14:30:52
【Question】:

I have a time series dataset with N observations and F features. Each feature is either present (1) or absent (0), so the dataset looks like this:

T    F1    F2    F3    F4    F5 ... F
0    1     0     0     1     0      0
1    0     1     0     0     1      1
2    0     0     0     1     1      0
3    1     1     1     1     0      0
...
N    1     1     0     1     0      0

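I am trying to use an LSTM-based architecture to predict which features are present at time T+1 from the observations at T-W through T, where W is the width of a time window. With W=4, the LSTM "sees" the 4 most recent time steps when making a prediction. The LSTM expects 3D input of shape (number_batches, W, F). A naive Keras implementation might look like this: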
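The windowing described above can be sketched in NumPy. This is a minimal illustration, not the asker's code: the helper name make_windows and the toy data are hypothetical, and it assumes each sample is W consecutive rows with the following row as its target.

```python
import numpy as np

def make_windows(data, W):
    """Slice an (N, F) array into windows of W rows plus next-row targets."""
    data = np.asarray(data)
    n_samples = data.shape[0] - W
    if n_samples <= 0:
        raise ValueError("need more than W observations")
    # x[i] holds rows i .. i+W-1, y[i] holds row i+W
    x = np.stack([data[i:i + W] for i in range(n_samples)])
    y = data[W:]
    return x, y

# Toy data (made up): 6 observations, 3 binary features.
data = np.array([[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1],
                 [1, 1, 0],
                 [0, 1, 1],
                 [1, 0, 1]])
x, y = make_windows(data, W=4)
print(x.shape, y.shape)  # (2, 4, 3) (2, 3)
```

The resulting x has the (samples, W, F) layout the LSTM expects, and y has one F-dimensional target per window.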

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(128, stateful=True, batch_input_shape=(batch_size, W, F)))
model.add(Dense(F, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size, epochs=250, shuffle=False,
          validation_data=(x_val, y_val))

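The main problem I am running into: the full dataset has a large number of features (> 200), and features are present relatively rarely, i.e. 0 is far more common than 1. The network simply learns to set every output to 0 and thereby achieves a high "accuracy".

Essentially, I want to weight every 1 in the input matrix by some value to give it more importance, but I am confused about how to do this in Keras. I know Keras has a sample_weight option, but how does it work? I would not know how to apply it in my example, for instance. Is this a reasonable approach to my problem? Which optimizers and loss functions are commonly used for this kind of problem?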
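The accuracy trap described here is easy to reproduce with made-up numbers. The sketch below assumes a hypothetical target matrix in which exactly 5% of the entries are 1 and a model that always predicts 0; the recall line is added only to show what plain accuracy hides.

```python
import numpy as np

# Made-up targets: 1000 rows, 200 features, 10 features always present.
y_true = np.zeros((1000, 200))
y_true[:, :10] = 1.0

# A degenerate model that always predicts "absent".
y_pred = np.zeros_like(y_true)

accuracy = np.mean(y_pred == y_true)
# Fraction of the true 1s that were actually predicted as 1.
recall = np.sum((y_pred == 1) & (y_true == 1)) / np.sum(y_true == 1)
print(accuracy, recall)  # 0.95 0.0
```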

【Comments】:

Have you tried using 0 = -1, 1 = 1 instead?
Side note: are you sure you want stateful=True in your case? See philipperemy.github.io/keras-stateful-lstm for how the training data has to be prepared in that case.

Tags: python time-series keras lstm rnn

【Solution 1】:

Here is a loss function I use for highly unbalanced 2D data; it works very well. You can swap binary_crossentropy for another loss.

import keras.backend as K

def weightedByBatch(yTrue,yPred):

    nVec = K.ones_like(yTrue) #to sum the total number of elements in the tensor
    percent = K.sum(yTrue) / K.sum(nVec) #percent of ones relative to total
    percent2 = 1 - percent #percent of zeros relative to total   
    yTrue2 = 1 - yTrue #complement of yTrue (yTrue+ yTrue2 = full of ones)   

    weights = (yTrue2 * percent2) + (yTrue*percent)
    return K.mean(K.binary_crossentropy(yTrue,yPred)/weights)
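To see what the weights in weightedByBatch do, here is a NumPy mirror of just the weight computation. It is a sketch for inspection only: the helper name batch_weights and the toy labels are hypothetical, and hard 0/1 labels are assumed.

```python
import numpy as np

def batch_weights(y_true):
    """NumPy mirror of the weights computed inside weightedByBatch."""
    y_true = np.asarray(y_true, dtype=float)
    percent = y_true.sum() / y_true.size   # fraction of ones in the batch
    # ones get weight `percent`, zeros get weight `1 - percent`
    return (1.0 - y_true) * (1.0 - percent) + y_true * percent

# Toy batch (made up): one positive among eight entries.
y_true = np.array([[1, 0, 0, 0],
                   [0, 0, 0, 0]])
w = batch_weights(y_true)
print(w[0, 0], w[0, 1])  # 0.125 0.875

# The loss is divided by w, so the effective per-element weight is 1 / w.
inv = 1.0 / w
print(inv[y_true == 1].sum(), inv[y_true == 0].sum())
```

Because the loss is divided by these weights, the single 1 is scaled by 8 and each 0 by about 1.14, so the two classes end up with roughly the same total weight in the batch.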

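For your 3D data this may work as is, but you could also work column-wise, creating a pair of weights for each feature instead of summing all features together.

That would be done like this: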

def weightedByBatch2D(yTrue,yPred):

    nVec = K.ones_like(yTrue) #to sum the total number of elements in the tensor
    percent = K.sum(K.sum(yTrue,axis=0,keepdims=True),axis=1,keepdims=True) / K.sum(K.sum(nVec,axis=0,keepdims=True),axis=1,keepdims=True) #percent of ones relative to total
    percent2 = 1 - percent #percent of zeros relative to total   
    yTrue2 = 1 - yTrue #complement of yTrue (yTrue+ yTrue2 = full of ones)   

    weights = (yTrue2 * percent2) + (yTrue*percent)
    return K.mean(K.binary_crossentropy(yTrue,yPred)/weights)
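One caveat: summing over axes 0 and 1 yields per-feature ratios only when the targets are 3D. If the targets are 2D with shape (batch, F), as they would be for the model in the question (an LSTM without return_sequences followed by Dense(F)), both axes are collapsed and the result matches the first function. A per-feature variant for that case would reduce over axis 0 only. The NumPy sketch below shows the weight computation under that assumption; per_feature_weights is a hypothetical helper, not part of the answer.

```python
import numpy as np

def per_feature_weights(y_true):
    """Per-column weights for a 2D (batch, F) array of 0/1 targets."""
    y_true = np.asarray(y_true, dtype=float)
    # shape (1, F): fraction of ones in each feature column
    percent = y_true.mean(axis=0, keepdims=True)
    return (1.0 - y_true) * (1.0 - percent) + y_true * percent

# Toy targets (made up): 4 samples, 3 features with different frequencies.
y_true = np.array([[1, 0, 1],
                   [0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 1]])
w = per_feature_weights(y_true)
print(w[:, 0])  # [0.25 0.75 0.75 0.75]
```

In backend terms the matching reduction would be K.mean(yTrue, axis=0, keepdims=True), with the rest of the loss unchanged.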

【Comments】:

Thanks! This looks like what I need.