Dimension mismatch while building LSTM RNN in Tensorflow: answers

【Question title】: Dimension mismatch while building LSTM RNN in Tensorflow
【Posted】: 2018-09-18 13:50:00
【Question】:

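I am trying to build a multi-layer, multi-class, multi-label LSTM in Tensorflow. I have been trying to adapt this tutorial to my data.

However, I get an error message saying that I have a dimension mismatch when building the RNN.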

ValueError: Dimensions must be equal, but are 1000 and 923 for 'rnn/while/rnn/multi_rnn_cell/cell_0/lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [?,1000], [923,2000].

I cannot work out which variable in the architecture I built is incorrect:

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.0, shape=shape)
    return tf.Variable(initial)


def lstm(x, weight, bias, n_steps, n_classes):

    cell = rnn_cell.LSTMCell(cfg.n_hidden_cells_in_layer, state_is_tuple=True)
    multi_layer_cell = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)

    # FIXME : ERROR binding x to LSTM as it is
    output, state = tf.nn.dynamic_rnn(multi_layer_cell, x, dtype=tf.float32)
    # FIXME : ERROR

    output_flattened = tf.reshape(output, [-1, cfg.n_hidden_cells_in_layer])
    output_logits = tf.add(tf.matmul(output_flattened, weight), bias)

    output_all = tf.nn.sigmoid(output_logits)
    output_reshaped = tf.reshape(output_all, [-1, n_steps, n_classes])

    # ??? switch batch size with sequence size. ???
    # then gather last time step values
    output_last = tf.gather(tf.transpose(output_reshaped, [1, 0, 2]), n_steps - 1)


    return output_last, output_all

These are my placeholders, loss function and all that jazz:

x_test, y_test = load_multiple_vector_files(test_filepaths)
x_valid, y_valid = load_multiple_vector_files(valid_filepaths)

n_input, n_steps, n_classes = get_input_target_lengths(check_print=False)


# FIXME n_input should be the problem
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])
y_steps = tf.placeholder("float", [None, n_classes])

weight = weight_variable([cfg.n_hidden_layers, n_classes])
bias = bias_variable([n_classes])
y_last, y_all = lstm(x, weight, bias, n_steps, n_classes)

#all_steps_cost=tf.reduce_mean(-tf.reduce_mean((y_steps * tf.log(y_all))+(1 - y_steps) * tf.log(1 - y_all),reduction_indices=1))
all_steps_cost = -tf.reduce_mean((y_steps * tf.log(y_all)) + (1 - y_steps) * tf.log(1 - y_all))
last_step_cost = -tf.reduce_mean((y * tf.log(y_last)) + ((1 - y) * tf.log(1 - y_last)))
loss_function = (cfg.alpha * all_steps_cost) + ((1 - cfg.alpha) * last_step_cost)

optimizer = tf.train.AdamOptimizer(learning_rate=cfg.learning_rate).minimize(loss_function)

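I am fairly sure that my x placeholder is causing the problem, leaving the layers and their matrices with mismatched dimensions. The linked example uses hard-coded constants, which makes it hard to see what each value actually means.

Can anyone help me? :)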

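Update: I made an "educated guess" about the mismatched dimensions. One is 2*hidden_width, so the hidden layer gets the new input plus its old recurrent input. The mismatched dimension, however, is input_width + hidden_width, as if it were trying to apply the recurrence of the input layer's width to the hidden layer.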
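
The shapes in the error message can be checked with a little arithmetic. The sketch below assumes the usual TensorFlow 1.x LSTM layout, in which a cell without a projection layer holds a single kernel of shape [input_width + hidden_width, 4 * hidden_width]. The helper names are hypothetical, and the widths are inferred from the error message rather than taken from the original configuration.

```python
def lstm_kernel_shape(input_width, hidden_width):
    """Kernel shape of an LSTM cell with no projection layer (assumed layout)."""
    # Rows: the input and the previous hidden state are concatenated.
    # Columns: the four gates (input, forget, cell candidate, output) are stacked.
    return (input_width + hidden_width, 4 * hidden_width)


def widths_from_kernel(kernel_shape):
    """Recover (input_width, hidden_width) from a kernel shape."""
    rows, cols = kernel_shape
    hidden_width = cols // 4
    return rows - hidden_width, hidden_width


# Kernel shape reported in the error message: [923, 2000].
input_width, hidden_width = widths_from_kernel((923, 2000))
print(input_width, hidden_width)  # 423 500

# The left operand [?, 1000] is as wide as a 500-wide input concatenated with
# a 500-wide hidden state, which is what a layer fed by another 500-unit
# layer would see.
second_layer_rows = lstm_kernel_shape(hidden_width, hidden_width)[0]
print(second_layer_rows)  # 1000
```

Under these assumptions the kernel was sized for a 423-wide input, while the operand being multiplied is 1000 wide. Reusing one cell object for both layers, as `[cell] * 2` does, is one possible way to end up in that situation; the answer below points at the output weight variable instead.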

【Comments】:

Tags: tensorflow machine-learning neural-network lstm

【Solution 1】:

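I found that I had set up the weight variable incorrectly, using the constant n_hidden_layers (the number of hidden layers) instead of n_hidden_cells_in_layer (the number of cells in each layer).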
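
A minimal sketch of why the wrong constant breaks the output projection, written in plain Python rather than TensorFlow. The helper `matmul_shape` and the concrete values (500 cells per layer, 2 layers, 10 classes, 64 flattened rows) are assumptions for illustration only; of these, only the 500 is suggested by the error message in the question.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or raise on mismatch."""
    if a_shape[1] != b_shape[0]:
        raise ValueError(
            "Dimensions must be equal, but are %d and %d" % (a_shape[1], b_shape[0])
        )
    return (a_shape[0], b_shape[1])


n_hidden_cells_in_layer = 500  # assumed: units per LSTM layer
n_hidden_layers = 2            # assumed: number of stacked layers
n_classes = 10                 # hypothetical
rows = 64                      # hypothetical batch_size * n_steps after flattening

# Shape of the LSTM output after tf.reshape(output, [-1, n_hidden_cells_in_layer]).
output_flattened = (rows, n_hidden_cells_in_layer)

# Correct: the weight's first dimension matches the LSTM output width.
print(matmul_shape(output_flattened, (n_hidden_cells_in_layer, n_classes)))  # (64, 10)

# Wrong: sizing the weight by the layer count leaves the inner dimensions unequal.
try:
    matmul_shape(output_flattened, (n_hidden_layers, n_classes))
except ValueError as err:
    print(err)  # Dimensions must be equal, but are 500 and 2
```

In the question's code this corresponds to creating the weight with `weight_variable([cfg.n_hidden_cells_in_layer, n_classes])`, so that it lines up with `output_flattened`.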

【Comments】: