Implementing contrastive loss and triplet loss in TensorFlow

【Question title】: Implementing contrastive loss and triplet loss in Tensorflow
【Posted】: 2016-07-08 06:20:37
【Question】:

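I started playing with TensorFlow two days ago, and I'm wondering whether the triplet loss and the contrastive loss are already implemented.

I've been looking at the documentation, but I haven't found any example or description of these losses.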

【Comments】:

Tags: tensorflow deep-learning

【Solution 1】:

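Update (2018/03/19): I wrote a blog post detailing how to implement triplet loss in TensorFlow.

You need to implement the contrastive loss or the triplet loss yourself, but once you know the pairs or triplets this is quite easy.

Contrastive Loss

Suppose you have as input pairs of data and their label (positive or negative, i.e. same class or different class). For instance, you have images of size 28x28x1 as input: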

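import tensorflow as tf  # TensorFlow 1.x graph-mode API

left = tf.placeholder(tf.float32, [None, 28, 28, 1])
right = tf.placeholder(tf.float32, [None, 28, 28, 1])
# float labels of shape [None] so they multiply element-wise with the distances
label = tf.placeholder(tf.float32, [None])  # 0 if same, 1 if different
margin = 0.2

# `model` is your own embedding network, shared by both inputs
left_output = model(left)  # shape [None, 128]
right_output = model(right)  # shape [None, 128]

# squared Euclidean distance of each pair, shape [None]
d = tf.reduce_sum(tf.square(left_output - right_output), 1)
d_sqrt = tf.sqrt(d)

# different pairs are pushed at least `margin` apart, same pairs are pulled together
loss = label * tf.square(tf.maximum(0., margin - d_sqrt)) + (1 - label) * d

loss = 0.5 * tf.reduce_mean(loss)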
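
To sanity-check the formula on numbers you can work out by hand, here is a minimal NumPy sketch of the same computation. It is not TensorFlow code; the function name `contrastive_loss` and the toy inputs are made up for illustration, and the label convention (0 if same, 1 if different) is the one used above.

```python
import numpy as np

def contrastive_loss(left, right, label, margin=0.2):
    """NumPy mirror of the loss above; label is 0 for same, 1 for different."""
    d = np.sum(np.square(left - right), axis=1)  # squared distance, shape [N]
    d_sqrt = np.sqrt(d)
    per_pair = label * np.square(np.maximum(0.0, margin - d_sqrt)) + (1 - label) * d
    return 0.5 * np.mean(per_pair)

# two toy pairs, both at distance 0.1 (hypothetical embeddings)
left = np.array([[0.0, 0.0], [0.0, 0.0]])
right = np.array([[0.1, 0.0], [0.1, 0.0]])
label = np.array([0.0, 1.0])  # first pair same class, second pair different

print(contrastive_loss(left, right, label))
```

The "same" pair contributes its squared distance (0.01) and the "different" pair contributes (margin - 0.1)^2 = 0.01, so the result is about 0.005.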

Triplet Loss

Same as with contrastive loss, but with triplets (anchor, positive, negative). You don't need labels here.

anchor_output = ...  # shape [None, 128]
positive_output = ...  # shape [None, 128]
negative_output = ...  # shape [None, 128]

d_pos = tf.reduce_sum(tf.square(anchor_output - positive_output), 1)
d_neg = tf.reduce_sum(tf.square(anchor_output - negative_output), 1)

loss = tf.maximum(0., margin + d_pos - d_neg)
loss = tf.reduce_mean(loss)
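
As a quick check of the triplet formula, the following NumPy sketch evaluates it on two hand-made triplets. The function name `triplet_loss` and the toy embeddings are assumptions for illustration only; the margin of 0.2 is the one defined earlier.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """NumPy mirror of the triplet loss above (squared distances)."""
    d_pos = np.sum(np.square(anchor - positive), axis=1)
    d_neg = np.sum(np.square(anchor - negative), axis=1)
    return np.mean(np.maximum(0.0, margin + d_pos - d_neg))

anchor = np.array([[0.0, 0.0], [0.0, 0.0]])
positive = np.array([[0.1, 0.0], [0.1, 0.0]])
# first negative is far away (loss 0), second is too close (positive loss)
negative = np.array([[1.0, 0.0], [0.3, 0.0]])

print(triplet_loss(anchor, positive, negative))
```

The first triplet already satisfies the margin and contributes 0; the second contributes 0.2 + 0.01 - 0.09 = 0.12, so the mean is about 0.06.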

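The real trouble when implementing triplet loss or contrastive loss in TensorFlow is how to sample the triplets or pairs. I will focus on generating triplets because it is harder than generating pairs.

The easiest way is to generate them outside of the Tensorflow graph, i.e. in python, and feed them to the network through the placeholders. Basically you select images 3 at a time, with the first two from the same class and the third from another class. We then perform a feedforward on these triplets and compute the triplet loss.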
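
A minimal sketch of that random selection in plain Python could look like the following. The helper name `sample_triplets` is hypothetical, and it assumes the labels contain at least two classes, at least one of which has two or more examples; it returns indices, which you would then use to gather the images to feed.

```python
import random
from collections import defaultdict

def sample_triplets(labels, num_triplets, seed=0):
    """Return random (anchor, positive, negative) index triplets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    # only classes with two or more examples can supply anchor + positive
    anchor_classes = [c for c, idx in by_class.items() if len(idx) >= 2]
    triplets = []
    for _ in range(num_triplets):
        pos_class = rng.choice(anchor_classes)
        neg_class = rng.choice([c for c in by_class if c != pos_class])
        anchor, positive = rng.sample(by_class[pos_class], 2)  # two distinct items
        negative = rng.choice(by_class[neg_class])
        triplets.append((anchor, positive, negative))
    return triplets

labels = [0, 0, 1, 1, 2, 2]
triplets = sample_triplets(labels, num_triplets=4)
print(triplets)
```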

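The issue here is that generating triplets is complicated. We want them to be valid triplets, i.e. triplets with a positive loss (otherwise the loss is 0 and the network doesn't learn).
To know whether a triplet is good or not you need to compute its loss, so you already have to make one feedforward through the network...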
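
Assuming you already have the embeddings from that first feedforward as a NumPy array, the filtering step can be sketched as below. The name `keep_active_triplets` and the toy data are made up for illustration; this is one simple way to do it, not the only one.

```python
import numpy as np

def keep_active_triplets(embeddings, triplets, margin=0.2):
    """Keep only index triplets whose triplet loss is strictly positive."""
    triplets = np.asarray(triplets)
    a, p, n = (embeddings[triplets[:, k]] for k in range(3))
    d_pos = np.sum(np.square(a - p), axis=1)
    d_neg = np.sum(np.square(a - n), axis=1)
    active = margin + d_pos - d_neg > 0
    return triplets[active]

# hypothetical embeddings produced by one forward pass
embeddings = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [0.3, 0.0]])
candidates = [(0, 1, 2), (0, 1, 3)]  # (anchor, positive, negative) indices

active = keep_active_triplets(embeddings, candidates)
print(active)
```

Here the triplet (0, 1, 2) is dropped because its negative is already far enough away, and only (0, 1, 3) remains for the forward + backward pass.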

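Clearly, implementing triplet loss in Tensorflow is hard. There are ways to make it more efficient than sampling in python, but explaining them would require a whole blog post!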

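【Comments】:

Hi @Olivier, I am very interested in the sampling part. Will you write, or have you written, a blog post about it? I'm doing what you said: feed forward once, compute the loss of all possible triplets, filter out the invalid ones, and sample a batch for another forward + backward pass...
I didn't write any blog post. One key insight is to compute all the possible triplets, as explained in OpenFace; my answer above contains the old solution. To remove the intermediate sess.run() call, you can add a tf.py_func operation to the graph to filter out the bad triplets.
@weitang114: another way to do the second part is to just compute the loss on all triplets and only remove the invalid triplets (i.e. (+, +, +)), which can be computed in advance. This converges surprisingly well.
Thanks for your suggestion. I didn't get the idea at that moment, but recently found it very useful. Implementing this process in tf helped me reduce training time from 5 days to 1 day. :)
@HelloLili: I finally wrote that blog post. Here it is: omoindrot.github.io/triplet-loss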
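
For readers who want to see the "loss on all triplets" idea from the comments in code, here is a NumPy sketch. It is an interpretation under stated assumptions, not the commenters' actual implementation: the name `batch_all_triplet_loss` is made up, a triplet counts as valid when the anchor and positive are distinct items of the same class and the negative is of another class, and the loss is averaged over all valid triplets. The validity mask depends only on the labels, which is why it can be computed in advance.

```python
import numpy as np

def batch_all_triplet_loss(labels, embeddings, margin=0.2):
    """Average triplet loss over every valid (anchor, positive, negative)."""
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sum(np.square(diff), axis=2)  # [B, B] squared distances
    # loss[a, p, n] = margin + dist[a, p] - dist[a, n]
    loss = margin + dist[:, :, None] - dist[:, None, :]
    same = labels[:, None] == labels[None, :]  # [B, B] same-class mask
    distinct = ~np.eye(len(labels), dtype=bool)
    # valid: a != p, label[a] == label[p], label[a] != label[n]
    valid = (same & distinct)[:, :, None] & (~same)[:, None, :]
    loss = np.maximum(0.0, loss) * valid
    return loss.sum() / max(valid.sum(), 1)

labels = np.array([0, 0, 1])
embeddings = np.array([[0.0, 0.0], [0.1, 0.0], [0.3, 0.0]])
print(batch_all_triplet_loss(labels, embeddings))
```

With these toy inputs there are two valid triplets, (0, 1, 2) and (1, 0, 2), with losses 0.12 and 0.17, so the result is about 0.145.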

【Solution 2】:

Triplet loss with semi-hard negative mining is now implemented in tf.contrib, as follows:

triplet_semihard_loss(
    labels,
    embeddings,
    margin=1.0
)

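where:

Args:

labels: 1-D tf.int32 Tensor with shape [batch_size] of multiclass integer labels.
embeddings: 2-D float Tensor of embedding vectors. Embeddings should be l2 normalized.
margin: Float, margin term in the loss definition.

Returns:

triplet_loss: tf.float32 scalar.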
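
Since the embeddings are expected to be l2 normalized, here is a small NumPy sketch of what that means: each row is divided by its Euclidean norm so that every embedding has length 1. The helper name `l2_normalize` and the `eps` guard against division by zero are choices made for this illustration; inside a TensorFlow graph you would normally use tf.nn.l2_normalize instead.

```python
import numpy as np

def l2_normalize(embeddings, eps=1e-12):
    """Scale each row to unit Euclidean length (eps avoids division by zero)."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.maximum(norms, eps)

embeddings = np.array([[3.0, 4.0], [0.0, 2.0]])
normalized = l2_normalize(embeddings)
print(normalized)                          # rows become [0.6, 0.8] and [0.0, 1.0]
print(np.linalg.norm(normalized, axis=1))  # every row now has norm 1
```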

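【Comments】:

Link-only answer? Please include some of the relevant parts of the link here.
While this link may provide some limited, immediate help, an answer should include sufficient context around the link so your fellow users will have some idea what it is and why it's there. Always quote the most relevant part of an important link, to make it more useful for future readers with other, similar questions. In addition, other users tend to respond negatively to answers which are barely more than a link to an external site, and they might be deleted.

【Solution 3】:

Tiago, I don't think you are using the same formula Olivier gave. Here is the right code (not sure it will work though, just fixing the formula):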

def compute_euclidean_distance(x, y):
    """
    Computes the squared euclidean distance between two batches of tensorflow tensors
    """

    # tf.subtract replaces tf.sub, which was renamed in TensorFlow 1.0
    d = tf.reduce_sum(tf.square(tf.subtract(x, y)), 1)
    return d


def compute_contrastive_loss(left_feature, right_feature, label, margin):

    """
    Compute the contrastive loss as in

        L = 0.5 * (1-Y) * D^2 + 0.5 * Y * {max(0, margin - D)}^2

    where D is the euclidean distance between the two features.

    **Parameters**
     left_feature: First element of the pair
     right_feature: Second element of the pair
     label: Label of the pair (0 if same class, 1 if different)
     margin: Contrastive margin

    **Returns**
     Return the loss operation

    """

    label = tf.cast(label, tf.float32)
    one = tf.constant(1.0)

    d = compute_euclidean_distance(left_feature, right_feature)  # D^2
    d_sqrt = tf.sqrt(d)  # D
    # tf.multiply replaces tf.mul, which was renamed in TensorFlow 1.0
    first_part = tf.multiply(one - label, d)  # (1-Y) * D^2

    max_part = tf.square(tf.maximum(margin - d_sqrt, 0.0))
    second_part = tf.multiply(label, max_part)  # Y * max(margin - D, 0)^2

    loss = 0.5 * tf.reduce_mean(first_part + second_part)

    return loss

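【Comments】:

Hi Wassim, thanks for the fix, just one patch in the code: d_sqrt = tf.sqrt(compute_euclidean_distance(left_feature, right_feature)). But even with this fix, my accuracy is very low (although the loss decreases as expected).
@TiagoFreitasPereira I have the same problem with my triplet loss implementation. I will let you know if I find a solution...
Hey @Wassim, thanks. If it's simpler, you can try to bootstrap from my project (github.com/tiagofrepereira2012/examples.tensorflow).
@TiagoFreitasPereira, this seems to be related to how we implement the accuracy computation. It looks like, when using triplet loss or contrastive loss, you can't compute the accuracy by checking predicted labels (because the network is not trained to distinguish the 10 classes). Instead, you have to compute it by evaluating whether the network correctly guesses if two elements are from the same class or not.
See sections 4 and 5.6 of this paper: arxiv.org/pdf/1503.03832v3.pdf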
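
A rough NumPy sketch of that kind of pair-based (verification) accuracy is shown below. It assumes a pair is predicted "same class" when its squared embedding distance falls below a threshold; the function name `verification_accuracy`, the toy embeddings and the threshold value are all hypothetical, and in practice the threshold would be chosen on held-out data.

```python
import numpy as np

def verification_accuracy(left, right, same, threshold):
    """Fraction of pairs whose same/different prediction matches `same`."""
    d = np.sum(np.square(left - right), axis=1)  # squared distance per pair
    predicted_same = d < threshold
    return np.mean(predicted_same == same)

# four toy pairs of embeddings; `same` marks whether each pair shares a class
left = np.zeros((4, 2))
right = np.array([[0.1, 0.0], [0.2, 0.0], [1.0, 0.0], [0.1, 0.0]])
same = np.array([True, True, False, False])

accuracy = verification_accuracy(left, right, same, threshold=0.25)
print(accuracy)
```

In this toy example the last pair is from different classes but lies very close together, so it is misclassified and the accuracy is 3 out of 4.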