【Question title】: How to build a Siamese Network from Transformer Model? Shape Input Error
【Posted】: 2022-01-13 11:22:42
【Question】:

I have the following base network with a few important parameters (the ones the error depends on); please assume all other parameters are defined.

maxlen = 250
model_dense = 256

Base model:

def build_base_model():
  inputs = layers.Input(shape=(maxlen,),name='base_input')
  embedding_layer = TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim)
  x = embedding_layer(inputs)
  transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim, trans_drop1, trans_drop2, trans_reg1, trans_reg2)
  x = transformer_block(x)
  x = layers.GlobalAveragePooling1D()(x)
  x = layers.Dropout(model_drop1)(x)
  outputs = layers.Dense(model_dense)(x)
  base_model = keras.Model(inputs=inputs, outputs=outputs)
  return base_model

My Siamese network is:

base_model = build_base_model()
input_text1 = layers.Input(shape=(maxlen,))
input_text2 = layers.Input(shape=(maxlen,))

emb1 = base_model(input_text1)
emb2 = base_model(input_text2)

distance = layers.Lambda(euclidean_distance)([emb1, emb2])
outputs = layers.Dense(1, activation="sigmoid")(distance)
model = keras.Model(inputs=[emb1, emb2], outputs=outputs)

model.compile(
    optimizer="adam", metrics = ["accuracy",], loss= 'binary_crossentropy')

history = model.fit(
    train_X, train_y, batch_size=batch_size, epochs = 50, validation_split = 0.15, callbacks = callbacks, verbose = 1,
)

It gives me an error:

ValueError: Input 0 of layer "model_11" is incompatible with the layer: expected shape=(None, 256), found shape=(None, 250)

What am I doing wrong?

  1. Base Transformer model tutorial taken from this
  2. Siamese Model Structure, cosine distance, make_pairs from this

Update: I built the network a different way, and it is now up and running. Can someone confirm whether it is correct?

inputs1 = layers.Input(shape=(maxlen,),name='inp_1')
inputs2 = layers.Input(shape=(maxlen,),name='inp_2')
embedding_layer = TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim)
transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim, trans_drop1, trans_drop2, trans_reg1, trans_reg2)
pooling = layers.GlobalAveragePooling1D()
drop_layer = layers.Dropout(model_drop1)
out_dense = layers.Dense(model_dense)

x1 = embedding_layer(inputs1)
x2 = embedding_layer(inputs2)

x1 = transformer_block(x1)
x2 = transformer_block(x2)

x1 = pooling(x1)
x2 = pooling(x2)


x1 = drop_layer(x1)
x2 = drop_layer(x2)

vec_x1 = out_dense(x1)
vec_x2 = out_dense(x2)


distance = layers.Lambda(euclidean_distance)([vec_x1, vec_x2])
outputs = layers.Dense(1, activation="sigmoid")(distance)
model = keras.Model(inputs=[inputs1, inputs2], outputs=outputs)
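
Both versions of the network call a euclidean_distance helper that the post does not show. The following is a minimal sketch of what such a helper might look like, assuming it computes a row-wise distance between two batches of embeddings; the 1e-7 floor under the square root is an assumption added here to keep the gradient finite when the two embeddings are identical.

```python
import tensorflow as tf


def euclidean_distance(vects):
    """Row-wise Euclidean distance between two batches of embeddings."""
    x, y = vects
    # Sum of squared differences per row, kept as a column of shape (batch, 1).
    sum_square = tf.reduce_sum(tf.square(x - y), axis=1, keepdims=True)
    # Floor the value before the square root (assumed constant, see above).
    return tf.sqrt(tf.maximum(sum_square, 1e-7))


# Two toy batches of 2-dimensional "embeddings".
a = tf.constant([[0.0, 0.0], [1.0, 1.0]])
b = tf.constant([[3.0, 4.0], [1.0, 1.0]])
d = euclidean_distance([a, b])
print(d.shape, d.numpy().ravel())
```

With keepdims=True the result has one column per pair, which is the shape the final Dense(1) layer expects as input.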

【Comments】:

    Tags: tensorflow keras deep-learning neural-network nlp


    【Solution 1】:

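    In the line model = keras.Model(inputs=[emb1, emb2], outputs=outputs): I suspect you meant model = keras.Model(inputs=[input_text1, input_text2], outputs=outputs). emb1 and emb2 are outputs of the base model with shape (None, 256), not input tensors, so the model ends up expecting 256-wide inputs while the data is 250 tokens wide, which matches the shapes in the error message.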
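
A runnable sketch of the corrected wiring follows, under some loud assumptions. The post's TokenAndPositionEmbedding and TransformerBlock classes are not shown, so a plain Embedding layer stands in for them. The vocab_size and embed_dim values, the inline euclidean_distance helper, and the random token ids are all made up for illustration. Only the last argument to keras.Model reflects the fix suggested above.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

maxlen = 250
model_dense = 256
vocab_size = 1000  # assumed value, not given in the post
embed_dim = 32     # assumed value, not given in the post


def euclidean_distance(vects):
    # Assumed helper: row-wise distance, shape (batch, 1).
    x, y = vects
    sum_square = tf.reduce_sum(tf.square(x - y), axis=1, keepdims=True)
    return tf.sqrt(tf.maximum(sum_square, 1e-7))


def build_base_model():
    # Stand-in encoder: Embedding replaces the post's
    # TokenAndPositionEmbedding + TransformerBlock, which are not shown.
    inputs = layers.Input(shape=(maxlen,), name="base_input")
    x = layers.Embedding(vocab_size, embed_dim)(inputs)
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(model_dense)(x)
    return keras.Model(inputs=inputs, outputs=outputs)


base_model = build_base_model()
input_text1 = layers.Input(shape=(maxlen,))
input_text2 = layers.Input(shape=(maxlen,))

# The same base_model instance is called twice, so both branches share weights.
emb1 = base_model(input_text1)  # shape (None, 256)
emb2 = base_model(input_text2)  # shape (None, 256)

distance = layers.Lambda(euclidean_distance, output_shape=(1,))([emb1, emb2])
outputs = layers.Dense(1, activation="sigmoid")(distance)

# The model's inputs are the Input tensors, not the embeddings.
model = keras.Model(inputs=[input_text1, input_text2], outputs=outputs)

# Random token ids, one array per branch, purely for illustration.
left = np.random.randint(0, vocab_size, size=(4, maxlen))
right = np.random.randint(0, vocab_size, size=(4, maxlen))
preds = model.predict([left, right], verbose=0)
print(preds.shape)
```

Note that a model built this way takes two arrays, one per branch. If train_X in the question is a single array of shape (n, 250), it would presumably need to be split into a pair such as [train_left, train_right] before calling model.fit.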

    【Comments】:
