【Question title】: ValueError: expected dense_22 to have shape (None, 37) but got array with shape (1000, 2)
【Posted】: 2021-01-06 21:27:57
【Question】:

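I am currently developing a question answering system. I created a synthetic dataset in which the answers contain multiple words. However, the answers are not spans of the given context.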

Initially, I planned to test it with a deep-learning-based model, but I ran into problems while building the model. This is how I vectorize the data.

def vectorize(data, word2idx, story_maxlen, question_maxlen, answer_maxlen):
    """ Create the story and question vectors and the label """
    Xs, Xq, Y = [], [], []
    for story, question, answer in data:
        xs = [word2idx[word] for word in story]
        xq = [word2idx[word] for word in question]
        y = [word2idx[word] for word in answer]
        #y = np.zeros(len(word2idx) + 1)
        #y[word2idx[answer]] = 1
        Xs.append(xs)
        Xq.append(xq)
        Y.append(y)
    return (pad_sequences(Xs, maxlen=story_maxlen), 
            pad_sequences(Xq, maxlen=question_maxlen),
            pad_sequences(Y, maxlen=answer_maxlen))
            #np.array(Y))

Below is how I create the model.

# story encoder. Output dim: (None, story_maxlen, EMBED_HIDDEN_SIZE)
story_encoder = Sequential()
story_encoder.add(Embedding(input_dim=vocab_size, 
                              output_dim=EMBED_HIDDEN_SIZE,
                              input_length=story_maxlen))
story_encoder.add(Dropout(0.3))

# question encoder. Output dim: (None, question_maxlen, EMBED_HIDDEN_SIZE)
question_encoder = Sequential()
question_encoder.add(Embedding(input_dim=vocab_size,
                               output_dim=EMBED_HIDDEN_SIZE,
                               input_length=question_maxlen))
question_encoder.add(Dropout(0.3))

# episodic memory (facts): story * question
# Output dim: (None, question_maxlen, story_maxlen)
facts_encoder = Sequential()

facts_encoder.add(Merge([story_encoder, question_encoder], 
                        mode="dot", dot_axes=[2, 2]))
facts_encoder.add(Permute((2, 1)))                        

## combine response and question vectors and do logistic regression
answer = Sequential()
answer.add(Merge([facts_encoder, question_encoder], 
                 mode="concat", concat_axis=-1))
answer.add(LSTM(LSTM_OUTPUT_SIZE, return_sequences=True))
answer.add(Dropout(0.3))
answer.add(Flatten())
answer.add(Dense(vocab_size,activation= "softmax"))


answer.compile(optimizer="rmsprop", loss="categorical_crossentropy",
               metrics=["accuracy"])

answer.fit([Xs_train, Xq_train], Y_train, 
           batch_size=BATCH_SIZE, nb_epoch=NBR_EPOCHS,
           validation_data=([Xs_test, Xq_test], Y_test))

Here is the model summary.

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
merge_46 (Merge)             (None, 5, 616)            0         
_________________________________________________________________
lstm_23 (LSTM)               (None, 5, 32)             83072     
_________________________________________________________________
dropout_69 (Dropout)         (None, 5, 32)             0         
_________________________________________________________________
flatten_9 (Flatten)          (None, 160)               0         
_________________________________________________________________
dense_22 (Dense)             (None, 37)                5957      
=================================================================
Total params: 93,765.0
Trainable params: 93,765.0
Non-trainable params: 0.0
_________________________________________________________________

It gives the following error.

ValueError: Error when checking model target: expected dense_22 to have shape (None, 37) but got array with shape (1000, 2)

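I think the error is related to Y_train and Y_test. I believe I should encode them as categorical values, but the answers are not spans of text; they are sequences of words. I do not know what to do or how to do it. How can I fix this? Any ideas?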

Edit:

When I use sparse_categorical_crossentropy as the loss and Reshape((2, -1)), answer.summary() gives:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
merge_94 (Merge)             (None, 5, 616)            0         
_________________________________________________________________
lstm_65 (LSTM)               (None, 5, 32)             83072     
_________________________________________________________________
dropout_139 (Dropout)        (None, 5, 32)             0         
_________________________________________________________________
reshape_22 (Reshape)         (None, 2, 80)             0         
_________________________________________________________________
dense_44 (Dense)             (None, 2, 37)             2997      
=================================================================
Total params: 90,805.0
Trainable params: 90,805.0
Non-trainable params: 0.0
_________________________________________________________________

Edit 2: the modified model

# story encoder. Output dim: (None, story_maxlen, EMBED_HIDDEN_SIZE)
story_encoder = Sequential()
story_encoder.add(Embedding(input_dim=vocab_size, 
                              output_dim=EMBED_HIDDEN_SIZE,
                              input_length=story_maxlen))
story_encoder.add(Dropout(0.3))

# question encoder. Output dim: (None, question_maxlen, EMBED_HIDDEN_SIZE)
question_encoder = Sequential()
question_encoder.add(Embedding(input_dim=vocab_size,
                               output_dim=EMBED_HIDDEN_SIZE,
                               input_length=question_maxlen))
question_encoder.add(Dropout(0.3))

# episodic memory (facts): story * question
# Output dim: (None, question_maxlen, story_maxlen)
facts_encoder = Sequential()

facts_encoder.add(Merge([story_encoder, question_encoder], 
                        mode="dot", dot_axes=[2, 2]))
facts_encoder.add(Permute((2, 1)))                        

## combine response and question vectors and do logistic regression
answer = Sequential()
answer.add(Merge([facts_encoder, question_encoder], 
                 mode="concat", concat_axis=-1))
answer.add(LSTM(LSTM_OUTPUT_SIZE, return_sequences=True))
answer.add(Dropout(0.3))
#answer.add(Flatten())
answer.add(keras.layers.Reshape((2, -1)))    
answer.add(Dense(vocab_size,activation= "softmax"))

answer.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy",
               metrics=["accuracy"])

answer.fit([Xs_train, Xq_train], Y_train, 
           batch_size=BATCH_SIZE, nb_epoch=NBR_EPOCHS,
           validation_data=([Xs_test, Xq_test], Y_test))

It still gives:

ValueError: Error when checking model target: expected dense_46 to have 3 dimensions, but got array with shape (1000, 2)

【Discussion】:

    Tags: keras lstm question-answering


    【Solution 1】:

    As I understand it, Y_train and Y_test contain indices (not one-hot vectors). If so, change the loss to sparse_categorical_crossentropy:

    answer.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
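
To make the distinction between index targets and one-hot targets concrete, here is a minimal NumPy sketch. The sample values in `Y_idx` are made up for illustration; `vocab_size = 37` and an answer length of 2 are the figures reported in the comments. It only shows the two target layouts and is not a statement about which one the asker's Keras version accepts.

```python
import numpy as np

vocab_size = 37      # reported in the comments
answer_maxlen = 2    # reported in the comments

# Hypothetical padded answers: each row is a sequence of word indices
# (0 is the padding value that pad_sequences inserts by default).
Y_idx = np.array([[0, 12],
                  [7, 3]])                      # shape (2, 2), integer labels

# One-hot layout: one vector of length vocab_size per answer position.
# This is the layout categorical_crossentropy works with.
Y_onehot = np.eye(vocab_size, dtype="float32")[Y_idx]   # shape (2, 2, 37)

print(Y_idx.shape)
print(Y_onehot.shape)
```

With the index layout, the labels stay as integers and the sparse variant of the loss is the natural match; with the one-hot layout, each position carries a full `vocab_size`-long vector.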
    

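    As I understand it, Y_train and Y_test also have a sequence dimension, and the question length (5) is not equal to the answer length (2). That dimension is removed by Flatten(). Try replacing Flatten() with Reshape():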

    # answer.add(Flatten())
    answer.add(tf.keras.layers.Reshape((2, -1)))    
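
The shape arithmetic behind this suggestion can be checked with plain NumPy. This is a sketch, not Keras code: the batch size of 4 is arbitrary, while the LSTM output shape (5, 32) and `vocab_size = 37` come from the model summaries in the question. A Dense layer is modelled here as a matrix product on the last axis.

```python
import numpy as np

batch = 4                                  # arbitrary, for illustration
question_maxlen, lstm_units, vocab_size = 5, 32, 37

lstm_out = np.zeros((batch, question_maxlen, lstm_units))   # (4, 5, 32)

# Flatten() collapses everything except the batch axis.
flat = lstm_out.reshape(batch, -1)                           # (4, 160)

# Reshape((2, -1)) keeps a length-2 axis and infers the rest: 5 * 32 / 2 = 80.
reshaped = lstm_out.reshape(batch, 2, -1)                    # (4, 2, 80)

# A Dense layer acts on the last axis: kernel (80, 37) plus bias (37,).
kernel = np.zeros((reshaped.shape[-1], vocab_size))
bias = np.zeros(vocab_size)
dense_out = reshaped @ kernel + bias                         # (4, 2, 37)

n_params = kernel.size + bias.size                           # 80 * 37 + 37
print(flat.shape, reshaped.shape, dense_out.shape, n_params)
```

The parameter count of 2997 agrees with the `dense_44` row in the asker's second summary. Note that this reshape regroups the 5 time steps into 2 rows of 80 values, so the two rows no longer correspond to individual LSTM time steps; it makes the shapes line up but does not by itself align outputs with answer positions.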
    

    【Comments】:

    • When I change the loss function, I get ValueError: Error when checking model target: expected dense_22 to have shape (None, 1) but got array with shape (1000, 2). The test set has 1000 samples, and each answer contains at most 2 words. @Andrey
    • Thanks @Andrey. Once I remove the Flatten() layer and add Reshape((2, -1)), it gives ValueError: Error when checking model target: expected dense_33 to have 3 dimensions, but got array with shape (1000, 2). vocab_size = 37, story_maxlen = 552, question_maxlen = 5, answer_maxlen = 2; story shape: (1000, 552), question shape: (1000, 5), answer shape: (1000, 2)
    • I updated the question and added answer.summary() after these modifications. @Andrey
    • @programming123 Did you change the loss to sparse_categorical_crossentropy?
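
The last error reported above ("expected ... to have 3 dimensions, but got array with shape (1000, 2)") says the target array has one axis fewer than the model output (None, 2, 37). One possible way to match the rank while keeping integer labels is to add a trailing axis of size 1. This is a sketch under the assumption that the asker's Keras version checks the target rank against the output rank; that has not been verified against their setup, and the zero-filled `Y_train` below is a stand-in for the real data.

```python
import numpy as np

# Stand-in for the real targets: 1000 answers, 2 word indices each.
Y_train = np.zeros((1000, 2), dtype="int32")

# Add a trailing axis so the targets are 3-D, like the (None, 2, 37) output.
Y_train_3d = np.expand_dims(Y_train, axis=-1)    # shape (1000, 2, 1)

print(Y_train.shape)
print(Y_train_3d.shape)
```

The same transformation would apply to `Y_test` before passing it as validation data. The label values themselves are unchanged; only the array rank differs.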