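【Posted】: 2022-01-13 11:22:42
【Question】:
I have the following base network. The parameters listed below are the important ones (they are the ones involved in the error); please assume all other parameters are defined.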
maxlen = 250
model_dense = 256
Base model:
def build_base_model():
    # The Input layer is created inside the function, so no argument is needed
    # (the function is called below as build_base_model()).
    inputs = layers.Input(shape=(maxlen,), name='base_input')
    embedding_layer = TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim)
    x = embedding_layer(inputs)
    transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim, trans_drop1, trans_drop2, trans_reg1, trans_reg2)
    x = transformer_block(x)
    x = layers.GlobalAveragePooling1D()(x)
    x = layers.Dropout(model_drop1)(x)
    outputs = layers.Dense(model_dense)(x)
    base_model = keras.Model(inputs=inputs, outputs=outputs)
    return base_model
My Siamese network is:
base_model = build_base_model()
input_text1 = layers.Input(shape=(maxlen,))
input_text2 = layers.Input(shape=(maxlen,))
emb1 = base_model(input_text1)
emb2 = base_model(input_text2)
distance = layers.Lambda(euclidean_distance)([emb1, emb2])
outputs = layers.Dense(1, activation="sigmoid")(distance)
model = keras.Model(inputs=[emb1, emb2], outputs=outputs)
model.compile(
    optimizer="adam", metrics=["accuracy"], loss="binary_crossentropy")
history = model.fit(
    train_X, train_y, batch_size=batch_size, epochs=50, validation_split=0.15, callbacks=callbacks, verbose=1,
)
It gives me this error:
ValueError: Input 0 of layer "model_11" is incompatible with the layer: expected shape=(None, 256), found shape=(None, 250)
What am I doing wrong?
- Base Transformer model tutorial taken from this
- Siamese Model Structure, cosine distance, make_pairs from this
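Regarding the ValueError above, a likely cause (offered as a sketch, not a confirmed diagnosis): `keras.Model(inputs=[emb1, emb2], ...)` declares the 256-dimensional outputs of `base_model` as the model's inputs, while the data passed to `fit` has length `maxlen = 250`. The outer model's inputs would normally be the `layers.Input` tensors `input_text1` and `input_text2`. In the sketch below, `build_base_model` is a simplified stand-in (a plain `Embedding` replaces the tutorial's `TokenAndPositionEmbedding` and `TransformerBlock`, which are not shown in the question), and `vocab_size`, `embed_dim` and the body of `euclidean_distance` are assumptions, not taken from the question.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

maxlen = 250
model_dense = 256
vocab_size = 1000  # assumed value, not given in the question
embed_dim = 32     # assumed value, not given in the question


def euclidean_distance(vects):
    # Assumed implementation: row-wise Euclidean distance, shape (batch, 1).
    x, y = vects
    sum_sq = tf.reduce_sum(tf.square(x - y), axis=1, keepdims=True)
    return tf.sqrt(tf.maximum(sum_sq, 1e-7))


def build_base_model():
    # Simplified stand-in for the transformer-based base model.
    inputs = layers.Input(shape=(maxlen,), name="base_input")
    x = layers.Embedding(vocab_size, embed_dim)(inputs)
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(model_dense)(x)
    return keras.Model(inputs=inputs, outputs=outputs)


base_model = build_base_model()
input_text1 = layers.Input(shape=(maxlen,))
input_text2 = layers.Input(shape=(maxlen,))
emb1 = base_model(input_text1)
emb2 = base_model(input_text2)
distance = layers.Lambda(euclidean_distance, output_shape=(1,))([emb1, emb2])
outputs = layers.Dense(1, activation="sigmoid")(distance)

# The model inputs are the Input tensors (length 250), not the embeddings (length 256).
model = keras.Model(inputs=[input_text1, input_text2], outputs=outputs)

# Random token ids, only to check that the shapes line up.
left = np.random.randint(0, vocab_size, size=(4, maxlen))
right = np.random.randint(0, vocab_size, size=(4, maxlen))
preds = model.predict([left, right], verbose=0)
print(preds.shape)
```

With two inputs, `fit` would also need the training data as a pair of arrays, for example `model.fit([train_X1, train_X2], train_y, ...)`, where `train_X1` and `train_X2` are hypothetical names for the left and right elements of each pair.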
Update: I built the network in a different way, and it is up and running. Can someone confirm whether it is correct?
inputs1 = layers.Input(shape=(maxlen,),name='inp_1')
inputs2 = layers.Input(shape=(maxlen,),name='inp_2')
embedding_layer = TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim)
transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim, trans_drop1, trans_drop2, trans_reg1, trans_reg2)
pooling = layers.GlobalAveragePooling1D()
drop_layer = layers.Dropout(model_drop1)
out_dense = layers.Dense(model_dense)
x1 = embedding_layer(inputs1)
x2 = embedding_layer(inputs2)
x1 = transformer_block(x1)
x2 = transformer_block(x2)
x1 = pooling(x1)
x2 = pooling(x2)
x1 = drop_layer(x1)
x2 = drop_layer(x2)
vec_x1 = out_dense(x1)
vec_x2 = out_dense(x2)
distance = layers.Lambda(euclidean_distance)([vec_x1, vec_x2])
outputs = layers.Dense(1, activation="sigmoid")(distance)
model = keras.Model(inputs=[inputs1, inputs2], outputs=outputs)
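One way to check that a network built like the update above really shares weights between its two branches is to feed the same batch to both inputs and compare the resulting vectors. Reusing the same layer objects for both inputs is what ties the weights in Keras. The sketch below makes several assumptions: a plain `Embedding` stands in for `TokenAndPositionEmbedding` and `TransformerBlock`, the dropout rate and the values of `vocab_size` and `embed_dim` are made up, and `probe` is a hypothetical helper model that exposes the two branch outputs.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

maxlen = 250
model_dense = 256
vocab_size = 1000  # assumed value, not given in the question
embed_dim = 32     # assumed value, not given in the question

inputs1 = layers.Input(shape=(maxlen,), name="inp_1")
inputs2 = layers.Input(shape=(maxlen,), name="inp_2")

# Each layer object is created once and reused for both branches.
embedding_layer = layers.Embedding(vocab_size, embed_dim)  # stand-in layer
pooling = layers.GlobalAveragePooling1D()
drop_layer = layers.Dropout(0.1)  # assumed rate
out_dense = layers.Dense(model_dense)

vec_x1 = out_dense(drop_layer(pooling(embedding_layer(inputs1))))
vec_x2 = out_dense(drop_layer(pooling(embedding_layer(inputs2))))

# Hypothetical probe model that returns both branch vectors.
probe = keras.Model(inputs=[inputs1, inputs2], outputs=[vec_x1, vec_x2])

# Same batch on both sides: shared weights should give identical vectors
# (dropout is inactive during predict).
sample = np.random.randint(0, vocab_size, size=(4, maxlen))
v1, v2 = probe.predict([sample, sample], verbose=0)
print(np.allclose(v1, v2))

# Weights are counted once, not once per branch:
# embedding table + dense kernel + dense bias.
print(len(probe.trainable_weights))
```

If the two branches had been built from separate layer instances, the vectors would generally differ and the trainable weight count would double.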
【Comments】:
Tags: tensorflow keras deep-learning neural-network nlp