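[Posted]: 2019-10-22 00:35:44
[Question]:
I am trying to train a Siamese model to predict whether the words written in two images are the same. In addition, the model should be able to distinguish between the handwriting of two different people. The problem is similar to signature verification.
My base network looks like this: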
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense, Dropout,
                                     Flatten, MaxPooling2D, ZeroPadding2D)
from tensorflow.keras.regularizers import l2

def create_base_network_signet(input_shape):
    '''Base Siamese Network'''
    seq = Sequential()
    seq.add(Conv2D(96, kernel_size=(7, 7), strides=2, input_shape=input_shape, activation='relu'))
    seq.add(BatchNormalization())
    seq.add(ZeroPadding2D(padding=(2, 2)))
    seq.add(Conv2D(96, kernel_size=(7, 7), strides=1, activation='relu'))
    seq.add(BatchNormalization())
    seq.add(MaxPooling2D(pool_size=(3, 3), strides=2))
    seq.add(ZeroPadding2D(padding=(1, 1)))
    seq.add(Conv2D(128, kernel_size=(5, 5), strides=1, activation='relu'))
    seq.add(Conv2D(128, kernel_size=(5, 5), strides=1, activation='relu'))
    seq.add(MaxPooling2D(pool_size=(3, 3), strides=2))
    seq.add(Dropout(0.3))
    seq.add(ZeroPadding2D(padding=(1, 1)))
    seq.add(Conv2D(384, kernel_size=(3, 3), strides=1, activation='relu'))
    seq.add(Conv2D(256, kernel_size=(3, 3), strides=1, activation='relu'))
    seq.add(BatchNormalization())
    seq.add(MaxPooling2D(pool_size=(3, 3), strides=2))
    seq.add(Dropout(0.3))
    seq.add(ZeroPadding2D(padding=(1, 1)))
    seq.add(Conv2D(128, kernel_size=(2, 2), strides=1, activation='relu'))
    seq.add(Dropout(0.3))
    seq.add(Flatten(name='flatten'))
    # Keras 2 argument names: kernel_regularizer / kernel_initializer (formerly W_regularizer / init)
    seq.add(Dense(1024, kernel_regularizer=l2(0.0005), activation='relu', kernel_initializer='glorot_uniform'))
    seq.add(Dropout(0.4))
    seq.add(Dense(128, kernel_regularizer=l2(0.0005), activation='relu', kernel_initializer='glorot_uniform'))  # softmax changed to relu
    return seq
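To make the size of the flattened feature map concrete, the spatial dimensions can be traced by hand through the convolution, padding and pooling layers above. The sketch below does that in plain Python. The question does not state the input size, so the 155x220 input is purely a hypothetical assumption, and the helper names (`window_output`, `trace`) are made up for illustration.

```python
def window_output(size, kernel, stride=1):
    """Output length of a 'valid' convolution or pooling window along one axis."""
    return (size - kernel) // stride + 1

# Shape-changing layers of the base network, in order.
# ("window", kernel, stride) covers Conv2D and MaxPooling2D; ("pad", n) is ZeroPadding2D.
LAYERS = [
    ("window", 7, 2),  # Conv2D 7x7, stride 2
    ("pad", 2),
    ("window", 7, 1),  # Conv2D 7x7
    ("window", 3, 2),  # MaxPooling2D 3x3, stride 2
    ("pad", 1),
    ("window", 5, 1),  # Conv2D 5x5
    ("window", 5, 1),  # Conv2D 5x5
    ("window", 3, 2),  # MaxPooling2D 3x3, stride 2
    ("pad", 1),
    ("window", 3, 1),  # Conv2D 3x3
    ("window", 3, 1),  # Conv2D 3x3
    ("window", 3, 2),  # MaxPooling2D 3x3, stride 2
    ("pad", 1),
    ("window", 2, 1),  # Conv2D 2x2
]

def trace(size):
    """Follow one spatial dimension through every layer in LAYERS."""
    for layer in LAYERS:
        if layer[0] == "pad":
            size += 2 * layer[1]  # padding is added on both sides
        else:
            size = window_output(size, layer[1], layer[2])
    return size

height, width = trace(155), trace(220)  # hypothetical 155x220 input (assumption)
flattened = height * width * 128        # the last Conv2D has 128 filters
print(height, width, flattened)         # 6 10 7680
```

Under that assumed input size, the first Dense layer receives a 7680-dimensional vector. A different input size changes these numbers, so the trace is worth repeating with the real image dimensions.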
Final model (contrastive loss):
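from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model

# euclidean_distance and eucl_dist_output_shape are helper functions defined elsewhere (not shown here).
base_network = create_base_network_signet(input_shape)
input_a = Input(shape=input_shape, name="first")
input_b = Input(shape=input_shape, name="second")
# Both inputs pass through the same base_network instance, so the weights are shared.
processed_a = base_network(input_a)
processed_b = base_network(input_b)
distance = Lambda(euclidean_distance, output_shape=eucl_dist_output_shape)([processed_a, processed_b])
model = Model(inputs=[input_a, input_b], outputs=distance)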
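Besides this model, I have tried other, simpler base models, and I have also tried training with VGG16 and Inception as the base model. I ran into the same problem with all of them: the model ends up learning to encode every input image as a zero vector.
I have tried both triplet loss and contrastive loss to train the model. Both end up with the same problem of predicting zeros. The contrastive loss function is taken from the Keras tutorial, and the triplet loss is defined as: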
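To see what a collapsed encoder means for the contrastive loss, here is a minimal NumPy sketch. It assumes the loss follows the usual tutorial form, with label 1 for matching pairs and a margin of 1. The exact functions used in the question are not shown, so the names, the `EPSILON` value and the toy labels below are illustrative assumptions only.

```python
import numpy as np

EPSILON = 1e-7  # stand-in for the backend epsilon; the exact value is an assumption

def euclidean_distance(a, b):
    """Row-wise Euclidean distance between two (batch, dim) arrays."""
    sum_square = np.sum(np.square(a - b), axis=1, keepdims=True)
    return np.sqrt(np.maximum(sum_square, EPSILON))

def contrastive_loss(y_true, distance, margin=1.0):
    """Contrastive loss where y_true is 1 for matching pairs and 0 otherwise."""
    similar_term = y_true * np.square(distance)
    dissimilar_term = (1.0 - y_true) * np.square(np.maximum(margin - distance, 0.0))
    return float(np.mean(similar_term + dissimilar_term))

# Collapsed encoder: every image in both branches maps to the zero vector.
batch, dim = 8, 128
zeros_a = np.zeros((batch, dim))
zeros_b = np.zeros((batch, dim))
# Toy labels: half the pairs match, half do not.
labels = np.array([1, 0, 1, 0, 1, 0, 1, 0], dtype=float).reshape(-1, 1)

collapsed_loss = contrastive_loss(labels, euclidean_distance(zeros_a, zeros_b))
print(collapsed_loss)  # close to 0.5
```

With all-zero encodings every distance is essentially zero, so matching pairs contribute nothing and each non-matching pair contributes roughly margin squared. The loss therefore sits near (fraction of non-matching pairs) x margin², which gives a reference value to compare against the training log.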
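import tensorflow as tf

def triplet_loss(y_true, y_pred, alpha=0.5):
    # Assumes y_pred can be indexed as (anchor, positive, negative) embeddings; y_true is unused.
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    # Squared Euclidean distances anchor-positive and anchor-negative
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), axis=-1)
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), axis=-1)
    # Hinge on (pos_dist - neg_dist + alpha), summed over all triplets
    basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), alpha)
    loss = tf.reduce_sum(tf.maximum(basic_loss, 0.0))
    return loss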
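The same computation can be checked on toy data with NumPy. This sketch assumes a layout the question does not confirm: the three embeddings concatenated along the last axis into one `(batch, 3 * dim)` array. With that layout `y_pred[0]` would select the first sample rather than all anchors, so the split is done along the last axis instead. The function name `triplet_loss_np` is hypothetical.

```python
import numpy as np

def triplet_loss_np(embeddings, alpha=0.5):
    """Triplet loss for a (batch, 3 * dim) array laid out as [anchor | positive | negative]."""
    dim = embeddings.shape[-1] // 3
    anchor = embeddings[:, :dim]
    positive = embeddings[:, dim:2 * dim]
    negative = embeddings[:, 2 * dim:]
    pos_dist = np.sum(np.square(anchor - positive), axis=-1)
    neg_dist = np.sum(np.square(anchor - negative), axis=-1)
    return float(np.sum(np.maximum(pos_dist - neg_dist + alpha, 0.0)))

batch, dim = 4, 2

# Collapsed encoder: all embeddings are zero, so each triplet contributes exactly alpha.
collapsed = np.zeros((batch, 3 * dim))
collapsed_loss = triplet_loss_np(collapsed)

# Well-separated triplets: anchor equals positive, negative is far away.
separated = np.tile(np.array([1.0, 0.0, 1.0, 0.0, 0.0, 5.0]), (batch, 1))
separated_loss = triplet_loss_np(separated)

print(collapsed_loss, separated_loss)  # 2.0 0.0
```

A training loss that settles at about `batch_size * alpha` would be consistent with the all-zero encodings described above. Whether the indexing concern applies depends on how the three branch outputs are combined in the actual model.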
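I would also like to mention that when I train the model with the binary_crossentropy loss function, it does start to learn encodings. However, once the accuracy reaches about 82%, it stops improving while the loss keeps decreasing.
This is what the output encodings look like with triplet loss and contrastive loss:
My training data looks like this:
[Discussion]:
Tags: python machine-learning keras deep-learning similarity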
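A loss that keeps falling while accuracy stays flat is arithmetically possible: cross-entropy rewards growing confidence on pairs that are already classified correctly, whereas accuracy only counts which side of the threshold each prediction falls on. The plain-Python sketch below uses made-up probabilities to illustrate this. It is not a diagnosis of the model in the question.

```python
import math

def binary_crossentropy(labels, probs):
    """Mean binary cross-entropy for labels in {0, 1} and probabilities in (0, 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

def accuracy(labels, probs, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    correct = sum((p >= threshold) == bool(y) for y, p in zip(labels, probs))
    return correct / len(labels)

labels = [1, 1, 0, 0, 1]

# Made-up predictions at two points in training; the last sample stays misclassified.
early = [0.6, 0.6, 0.4, 0.4, 0.4]
later = [0.9, 0.9, 0.1, 0.1, 0.4]

early_loss, later_loss = binary_crossentropy(labels, early), binary_crossentropy(labels, later)
early_acc, later_acc = accuracy(labels, early), accuracy(labels, later)
print(early_acc, later_acc)     # 0.8 0.8
print(early_loss > later_loss)  # True
```

Here the four correct predictions become more confident, which lowers the loss, while the one wrong prediction never crosses the threshold, so accuracy stays at 0.8.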