【Question title】: InvalidArgumentError using stacked LSTMs in Keras
【Posted】: 2017-06-23 23:58:43
【Question description】:

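I am working with a dataset in which each batch item is a text, represented by a matrix of shape (max_sentences_per_text, max_tokens_per_sentence). It goes through an embedding layer (becoming 3D), then a time-distributed LSTM that outputs one vector per sentence (back to 2D). A second LSTM layer then reads all the sentence vectors and outputs one final vector per batch item, which can be passed through ordinary dense layers.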
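To make the shape bookkeeping concrete, here is a minimal pure-Python sketch that traces the output shape after each layer of the architecture described above, this time including the batch axis. It is only shape arithmetic, not Keras code. The helper name `trace_shapes` is hypothetical, and the embedding size (300), token-LSTM size (100) and hidden size (50) are assumed values for illustration; 85, 40, 5 and the sentence-LSTM size of 100 are read off the shapes and the error message quoted in the question.

```python
def trace_shapes(batch, num_sentences, max_sentence_size, embedding_size,
                 lstm1_units, lstm2_units, hidden_units, num_scores):
    """Return (layer name, output shape) pairs for the stacked-LSTM model."""
    shapes = []

    # Integer token ids: one (sentences x tokens) matrix per text.
    shape = (batch, num_sentences, max_sentence_size)
    shapes.append(("inputs", shape))

    # The embedding layer appends a feature axis to every token id.
    shape = shape + (embedding_size,)
    shapes.append(("embedding", shape))

    # A Dense layer only transforms the last axis.
    shape = shape[:-1] + (lstm1_units,)
    shapes.append(("projection", shape))

    # TimeDistributed(LSTM) consumes the token axis of each sentence.
    shape = (batch, num_sentences, lstm1_units)
    shapes.append(("token_lstm", shape))

    # The second LSTM consumes the sentence axis.
    shape = (batch, lstm2_units)
    shapes.append(("sentence_lstm", shape))

    shapes.append(("hidden", (batch, hidden_units)))
    shapes.append(("scorer", (batch, num_scores)))
    return shapes


# Assumed sizes: 300, the first 100 and 50 are illustrative only.
for name, shape in trace_shapes(32, 85, 40, 300, 100, 100, 50, 5):
    print(name, shape)
```

Nothing in this trace is inconsistent, which suggests the layer shapes themselves are fine and the failure comes from somewhere else in the graph.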

The model diagram (generated with keras.utils.plot_model) shows 85 sentences per text and 40 tokens per sentence.

This is the model code:

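import keras
from keras import initializers
from keras.layers import Input, Embedding, Dense, LSTM, TimeDistributed

inputs = Input([num_sentences, max_sentence_size])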

vocab_size, embedding_size = embeddings.shape
init = initializers.constant(embeddings)
emb_layer = Embedding(vocab_size, embedding_size, mask_zero=True,
                      embeddings_initializer=init)
emb_layer.trainable = False
embedded = emb_layer(inputs)

projection_layer = Dense(lstm1_units, activation=None, use_bias=False,
                         name='projection')
projected = projection_layer(embedded)

lstm1 = LSTM(lstm1_units, name='token_lstm')
sentence_vectors = TimeDistributed(lstm1)(projected)

lstm2 = LSTM(lstm2_units, name='sentence_lstm')
final_vector = lstm2(sentence_vectors)

hidden = Dense(hidden_units, activation='relu', name='hidden')(final_vector)
scores = Dense(num_scores, activation='sigmoid', name='scorer')(hidden)

model = keras.models.Model(inputs, scores)

This looks fine to me, except that I get the following error:

Traceback (most recent call last):
  File "src/network.py", line 43, in <module>
    network.fit(x, y, validation_data=(xval, yval))
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 1507, in fit
    initial_epoch=initial_epoch)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 1156, in _fit_loop
    outs = f(ins_batch)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2269, in __call__
    **self.session_kwargs)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Inputs to operation sentence_lstm/while/Select_2 of type Select must have the same size and shape.  Input 0: [32,4000] != input 1: [32,100]
     [[Node: sentence_lstm/while/Select_2 = Select[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](sentence_lstm/while/Tile, sentence_lstm/while/add_5, sentence_lstm/while/Identity_3)]]


Caused by op u'sentence_lstm/while/Select_2', defined at:
  File "src/network.py", line 37, in <module>
    args.hidden_units)
  File "src/model.py", line 51, in create_model
    final_vector = lstm2(sentence_vectors)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/keras/layers/recurrent.py", line 262, in __call__
    return super(Recurrent, self).__call__(inputs, **kwargs)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 596, in __call__
    output = self.call(inputs, **kwargs)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/keras/layers/recurrent.py", line 341, in call
    input_length=input_shape[1])
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2538, in rnn
    swap_memory=True)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2605, in while_loop
    result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2438, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2388, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2509, in _step
    new_states = [tf.where(tiled_mask_t, new_states[i], states[i]) for i in range(len(states))]
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 2301, in where
    return gen_math_ops._select(condition=condition, t=x, e=y, name=name)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 2386, in _select
    name=name)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/erick/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Inputs to operation sentence_lstm/while/Select_2 of type Select must have the same size and shape.  Input 0: [32,4000] != input 1: [32,100]
     [[Node: sentence_lstm/while/Select_2 = Select[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](sentence_lstm/while/Tile, sentence_lstm/while/add_5, sentence_lstm/while/Identity_3)]]

The training call is network.fit(x, y, validation_data=(xval, yval)), with the following shapes:

In [89]: x.shape
Out[89]: (1000, 85, 40)

In [90]: y.shape
Out[90]: (1000, 5)

In [91]: xval.shape
Out[91]: (500, 85, 40)

In [92]: yval.shape
Out[92]: (500, 5)

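【Comments】:

  • It looks like there is a problem with your input size. What are you passing as input to model.fit? That is where the error is raised.
  • I don't think the problem is in the input. I edited the question with more information, showing that the problem originates in the call to the second LSTM layer.

Tags: keras recurrent-neural-network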


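【Solution 1】:

Moved from the question:

Update: after a lot of searching, I found that the problem is that TimeDistributed does not work with masking. I can make the model run by wrapping the embedding layer call, i.e. TimeDistributed(emb_layer)(inputs), but this disables masking for the whole model.

This is a known issue in Keras, but there is still no solution: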

https://github.com/fchollet/keras/issues/4786 https://github.com/fchollet/keras/issues/3030
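The two shapes in the error message are consistent with a masking problem. The sketch below is a reading of the traceback line `tf.where(tiled_mask_t, new_states[i], states[i])`, not a confirmed diagnosis: it assumes the per-token mask of shape (batch, 85, 40) produced by the embedding layer reaches the sentence-level LSTM unchanged. At each sentence step the LSTM would then see a mask slice of shape (batch, 40), and tiling it once per unit gives (batch, 40 * 100) = (batch, 4000), which cannot be selected against a (batch, 100) state. The helper names are hypothetical, and 32 is taken to be the batch size (the Keras default for fit).

```python
def step_mask_shape(mask_shape):
    """Shape of the mask slice seen at one time step (time axis 1 removed)."""
    return mask_shape[:1] + mask_shape[2:]


def tile_last_axis(shape, repeats):
    """Shape after repeating a 2-D tensor `repeats` times along its last axis."""
    batch, width = shape
    return (batch, width * repeats)


batch_size, num_sentences, max_sentence_size = 32, 85, 40
lstm2_units = 100  # read off the [32,100] state shape in the error message

# Expected case: a per-sentence mask (batch, sentences) with a trailing
# axis of size 1, so each step sees a (batch, 1) slice.
sentence_mask = (batch_size, num_sentences, 1)
good_tiled = tile_last_axis(step_mask_shape(sentence_mask), lstm2_units)

# Assumed failing case: the per-token mask from the embedding layer is
# handed to the sentence-level LSTM as is.
token_mask = (batch_size, num_sentences, max_sentence_size)
bad_tiled = tile_last_axis(step_mask_shape(token_mask), lstm2_units)

state_shape = (batch_size, lstm2_units)
print(good_tiled == state_shape)  # tiled mask lines up with the LSTM state
print(bad_tiled, state_shape)     # the two shapes being compared by Select
```

Under this reading, any fix has to either collapse the mask to one value per sentence before the second LSTM or stop the mask from propagating at all, which is what wrapping the embedding layer in TimeDistributed does.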

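【Comments】:

    【Solution 2】:

    OK, I think I found the error.

    final_vector = lstm2(sentence_vectors)
    

    should be

    final_vector = (lstm2)(sentence_vectors)
    

    Otherwise, you are calling lstm2 as a function with sentence_vectors as its argument.

    【Comments】:

    • Parentheses around a single variable in Python do not actually change anything. It is supposed to be a function call, because this is the functional Keras API. Every Layer object implements the __call__() method.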
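The point about parentheses can be checked without Keras. The sketch below uses a hypothetical `ToyLayer` class (a stand-in, not a Keras class) that implements `__call__`, and compares `layer(x)` with `(layer)(x)`.

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras layer: an object that is callable."""

    def __init__(self, name):
        self.name = name
        self.calls = 0

    def __call__(self, inputs):
        # Reached by both `layer(x)` and `(layer)(x)`.
        self.calls += 1
        return "{}({})".format(self.name, inputs)


lstm2 = ToyLayer("sentence_lstm")

plain = lstm2("sentence_vectors")
parenthesized = (lstm2)("sentence_vectors")

print(plain == parenthesized)  # whether the two spellings give the same result
print(lstm2.calls)             # how many times __call__ was invoked
```

Both expressions parse to the same call, so the change proposed in Solution 2 cannot affect the graph that Keras builds.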