Should Dropout masks be reused during Adversarial Training? Answers

【Question title】: Should Dropout masks be reused during Adversarial Training?
【Posted】: 2018-11-20 14:34:56
【Question】:

I am implementing adversarial training with the FGSM method from Explaining and Harnessing Adversarial Examples, using a custom loss function.
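
For reference, the FGSM adversarial objective in the cited paper has the form shown here. Reading $\alpha = 0.5$ and $\epsilon = 0.3$ off the constants in the code in this question is an interpretation, not something the post states explicitly; $J$ denotes the cross-entropy loss and $\theta$ the model parameters.

```latex
\tilde{J}(\theta, x, y)
  = \alpha \, J(\theta, x, y)
  + (1 - \alpha) \, J\bigl(\theta,\; x + \epsilon \,\operatorname{sign}\bigl(\nabla_x J(\theta, x, y)\bigr),\; y\bigr)
```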

Implemented in tf.keras with a custom loss function, it conceptually looks like this:

import tensorflow as tf
from tensorflow.keras.models import Sequential

model = Sequential([
    ...
])

def loss(labels, logits):
    # Compute the cross-entropy on the legitimate examples
    cross_ent = tf.losses.softmax_cross_entropy(labels, logits)

    # Compute the adversarial examples
    gradients, = tf.gradients(cross_ent, model.input)
    inputs_adv = tf.stop_gradient(model.input + 0.3 * tf.sign(gradients))

    # Compute the cross-entropy on the adversarial examples
    logits_adv = model(inputs_adv)
    cross_ent_adv = tf.losses.softmax_cross_entropy(labels, logits_adv)

    return 0.5 * cross_ent + 0.5 * cross_ent_adv

model.compile(optimizer='adam', loss=loss)
model.fit(x_train, y_train, ...)

This works well for a simple convolutional neural network.

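During the logits_adv = model(inputs_adv) call, the model is called a second time. This means it uses a different dropout mask than the original feed-forward pass on model.input. However, inputs_adv was created with tf.gradients(cross_ent, model.input), i.e. with the dropout mask from the original feed-forward pass. This could be problematic, because letting the model draw a new dropout mask will likely dampen the effect of the adversarial batch.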
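
To make the concern concrete, here is a minimal NumPy sketch. It is not the tf.keras model from the question: the tiny two-layer network, the shapes, and the helper names `forward` and `sample_mask` are all assumptions chosen for illustration. It only shows that a second forward pass with a freshly sampled mask computes a different function of the same input, while a pass that reuses the stored mask reproduces the first one.

```python
import numpy as np

rng = np.random.default_rng(0)
KEEP_PROB = 0.5  # hypothetical keep probability for the dropout layer


def sample_mask(shape, keep_prob=KEEP_PROB):
    """Draw a binary dropout mask (1 = keep the unit, 0 = drop it)."""
    return (rng.random(shape) < keep_prob).astype(np.float64)


def forward(x, w1, w2, mask, keep_prob=KEEP_PROB):
    """ReLU hidden layer, inverted dropout with an explicit mask, linear output."""
    hidden = np.maximum(x @ w1, 0.0)
    return (hidden * mask / keep_prob) @ w2


# Toy batch and weights (illustrative shapes only).
x = rng.normal(size=(4, 8))
w1 = rng.normal(size=(8, 16))
w2 = rng.normal(size=(16, 3))

mask_a = sample_mask((4, 16))  # mask of the original feed-forward pass
mask_b = sample_mask((4, 16))  # mask a second, independent call would draw

logits_first = forward(x, w1, w2, mask_a)
logits_naive = forward(x, w1, w2, mask_b)  # analogous to calling the model again
logits_reuse = forward(x, w1, w2, mask_a)  # second pass with the stored mask

same_with_reuse = np.allclose(logits_first, logits_reuse)
same_with_fresh = np.allclose(logits_first, logits_naive)
print("reused mask reproduces first pass:", same_with_reuse)
print("fresh mask reproduces first pass:", same_with_fresh)
```

In the naive implementation, the gradient that defines inputs_adv comes from the pass corresponding to `mask_a`, whereas the adversarial loss is evaluated under something like `mask_b`.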

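Since implementing the reuse of dropout masks in Keras would be cumbersome, I am interested in the actual effect of reusing the masks. Does it make a difference with respect to the test accuracy on both legitimate and adversarial examples?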

【Comments】:

Tags: python tensorflow keras neural-network conv-neural-network

【Answer 1】:

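I tried reusing the dropout masks during the feed-forward pass of the adversarial training step, using a simple CNN on MNIST. I chose the same network architecture as the one used in the cleverhans tutorial, with an additional dropout layer before the softmax layer.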

Here is the result (red = dropout masks reused, blue = naive implementation):

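The solid lines show the accuracy on the legitimate test examples. The dashed lines show the accuracy on the adversarial examples generated from the test set.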

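In conclusion, if you use adversarial training only as a regularizer to improve the test accuracy itself, reusing the dropout masks may not be worth the effort. For robustness against adversarial attacks, it may make a difference. However, you would need further experiments with other datasets, architectures, random seeds, etc. to make a more confident statement.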

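To keep the figure above readable, I omitted the accuracy on adversarial test examples for a model trained without adversarial training. The values are around 10%.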

You can find the code for this experiment in this gist. With TensorFlow's eager mode, storing and reusing the dropout masks was rather straightforward to implement.
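
The idea of storing and reusing a mask can be sketched independently of any framework. The following NumPy example is not the code from the gist and does not use TensorFlow: the toy network, the hand-written gradient, and the names `forward`, `input_gradient`, and `fgsm_losses` are assumptions for illustration. It samples one dropout mask, uses it for the clean pass and for the input gradient that defines the FGSM example, and then either reuses it or draws a new one for the adversarial pass.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean cross-entropy for one-hot labels."""
    return -np.mean(np.sum(labels * np.log(probs + 1e-12), axis=1))


def forward(x, w1, w2, mask, keep_prob):
    """ReLU layer, inverted dropout with an explicit mask, softmax output."""
    pre = x @ w1
    hidden = np.maximum(pre, 0.0)
    return pre, softmax((hidden * mask / keep_prob) @ w2)


def input_gradient(x, labels, w1, w2, mask, keep_prob):
    """Gradient of the mean cross-entropy w.r.t. x under a fixed dropout mask."""
    pre, probs = forward(x, w1, w2, mask, keep_prob)
    d_logits = (probs - labels) / x.shape[0]
    d_hidden = (d_logits @ w2.T) * mask / keep_prob * (pre > 0)
    return d_hidden @ w1.T


def fgsm_losses(x, labels, w1, w2, rng, eps=0.3, keep_prob=0.5, reuse_mask=True):
    """Return (clean loss, adversarial loss, 0.5/0.5 combined loss)."""
    # Sample the mask once and keep it around.
    mask = (rng.random((x.shape[0], w1.shape[1])) < keep_prob).astype(x.dtype)
    _, probs = forward(x, w1, w2, mask, keep_prob)
    grad = input_gradient(x, labels, w1, w2, mask, keep_prob)
    x_adv = x + eps * np.sign(grad)  # treated as a constant, like stop_gradient
    if not reuse_mask:
        # Naive variant: the adversarial pass draws a fresh mask.
        mask = (rng.random(mask.shape) < keep_prob).astype(x.dtype)
    _, probs_adv = forward(x_adv, w1, w2, mask, keep_prob)
    clean = cross_entropy(probs, labels)
    adv = cross_entropy(probs_adv, labels)
    return clean, adv, 0.5 * clean + 0.5 * adv


rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))
labels = np.eye(3)[rng.integers(0, 3, size=32)]
w1 = rng.normal(size=(8, 16)) * 0.1
w2 = rng.normal(size=(16, 3)) * 0.1

clean, adv, total = fgsm_losses(x, labels, w1, w2, rng, reuse_mask=True)
print("clean:", clean, "adversarial:", adv, "combined:", total)
```

Calling `fgsm_losses(..., reuse_mask=False)` gives the naive variant for comparison. Whether the difference matters in practice is exactly what the experiment above tries to measure, and a toy model like this one says nothing about it.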

【Comments】: