【Question title】: Tensorflow batch_norm does not work properly when testing (is_training=False)
【Posted】: 2017-08-03 21:21:09
【Question】:

I am training the following model:

with slim.arg_scope(inception_arg_scope(is_training=True)):
    logits_v, endpoints_v = inception_v3(all_v, num_classes=25, is_training=True, dropout_keep_prob=0.8,
                     spatial_squeeze=True, reuse=reuse_variables, scope='vis')
    logits_p, endpoints_p = inception_v3(all_p, num_classes=25, is_training=True, dropout_keep_prob=0.8,
                     spatial_squeeze=True, reuse=reuse_variables, scope='pol')
    pol_features = endpoints_p['pol/features']
    vis_features = endpoints_v['vis/features']

eps = 1e-08
loss = tf.sqrt(tf.maximum(tf.reduce_sum(tf.square(pol_features - vis_features), axis=1, keep_dims=True), eps))

# rest of code
saver = tf.train.Saver(tf.global_variables())

where

def inception_arg_scope(weight_decay=0.00004,
                        batch_norm_decay=0.9997,
                        batch_norm_epsilon=0.001, is_training=True):
    normalizer_params = {
        'decay': batch_norm_decay,
        'epsilon': batch_norm_epsilon,
        'is_training': is_training
    }
    normalizer_fn = tf.contrib.layers.batch_norm

    # Set weight_decay for weights in Conv and FC layers.
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        weights_regularizer=slim.l2_regularizer(weight_decay)):
        with slim.arg_scope([slim.batch_norm, slim.dropout], is_training=is_training):
            with slim.arg_scope(
                    [slim.conv2d],
                    weights_initializer=slim.variance_scaling_initializer(),
                    activation_fn=tf.nn.relu,
                    normalizer_fn=normalizer_fn,
                    normalizer_params=normalizer_params) as sc:
                return sc

inception_v3 is defined here. My model trains well, with the loss dropping from 60 to below 1. But when I try to test the model in another file:

with slim.arg_scope(inception_arg_scope(is_training=False)):
    logits_v, endpoints_v = inception_v3(all_v, num_classes=25, is_training=False, dropout_keep_prob=0.8,
                     spatial_squeeze=True, reuse=reuse_variables, scope='vis')
    logits_p, endpoints_p = inception_v3(all_p, num_classes=25, is_training=False, dropout_keep_prob=0.8,
                     spatial_squeeze=True, reuse=reuse_variables, scope='pol')

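It gives me meaningless results; more precisely, the loss is 1e-8 for every training and test sample. When I switch to is_training=True, the results are more sensible, but the loss is still larger than it was during training (even though I am testing on the training data). I have the same problem with VGG16: without batch_norm my test accuracy is 100%, and with batch_norm it is 0%.

What am I missing here? Thanks,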

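【Question comments】:

  • What is decay set to in your batch norm layers? If it is decay=0.999, try lowering it to decay=0.99 or decay=0.9 and see whether that fixes your problem.
  • My mistake was that "batchnorm_updates_op" was missing as a dependency when applying apply_gradient_op. After fixing that, I reduced decay to 0.9 and it still did not work at test time (huge loss). Then I chose 0.99 and it worked. Thanks anyway...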
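
To make the two comments above concrete, here is a minimal pure-Python sketch of the moving-average rule that batch norm uses for its inference-time statistics (moving = decay * moving + (1 - decay) * batch_value). The helper names and the constant batch mean of 5.0 are made up for illustration; they are not from the original post. It only shows the arithmetic: with zero updates the moving mean stays at its initial value, and with decay=0.9997 it moves very slowly.

```python
def update_moving_average(moving, batch_value, decay):
    """One batch-norm style update of a moving statistic."""
    return decay * moving + (1.0 - decay) * batch_value


def moving_mean_after(steps, decay, true_mean=5.0, initial=0.0):
    """Apply `steps` updates toward a constant batch mean (hypothetical value).

    `initial=0.0` mirrors the usual zero initialisation of moving_mean.
    """
    moving = initial
    for _ in range(steps):
        moving = update_moving_average(moving, true_mean, decay)
    return moving


if __name__ == "__main__":
    # Compare how far the moving mean gets for different decay values.
    for decay in (0.9, 0.99, 0.9997):
        for steps in (0, 100, 10000):
            print(decay, steps, round(moving_mean_after(steps, decay), 4))
```

If the update op never runs, this is the steps=0 case: at is_training=False the layer normalizes with the initial statistics instead of the data's statistics. A small decay such as 0.9 converges quickly but effectively averages over only about 1 / (1 - decay) = 10 recent batches, which may explain why it was too noisy for the poster while 0.99 worked.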

Tags: testing tensorflow batch-normalization


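【Solution 1】:

I ran into the same problem and solved it. When you use slim.batch_norm, be sure to build the training op with slim.learning.create_train_op rather than tf.train.GradientDescentOptimizer(lr).minimize(loss) or another optimizer's minimize(). A plain minimize() call does not run the ops that update the batch norm moving mean and variance, so the statistics used when is_training=False are never updated. Try it and see if it works!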
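
A minimal sketch of that advice, assuming TensorFlow 1.x (tf.contrib.slim does not exist in TF 2.x). The helper name build_train_op and the learning rate of 0.01 are hypothetical, not from the original post. The second branch shows the manual wiring described in the question comments: making the training step depend on the ops in tf.GraphKeys.UPDATE_OPS, which is where slim.batch_norm puts its moving-average updates by default.

```python
def build_train_op(total_loss, learning_rate=0.01, use_slim=True):
    """Return a train op that also runs the batch-norm moving-average updates.

    Assumes TensorFlow 1.x and that `total_loss` is a scalar loss tensor
    already built in the default graph (e.g. the mean of `loss` above).
    """
    # Imported inside the function so the helper can be defined without
    # TensorFlow being importable; TF 1.x is assumed when it is called.
    import tensorflow as tf
    slim = tf.contrib.slim

    optimizer = tf.train.GradientDescentOptimizer(learning_rate)

    if use_slim:
        # With update_ops left at its default, create_train_op gathers
        # tf.GraphKeys.UPDATE_OPS and makes the train op depend on them.
        return slim.learning.create_train_op(total_loss, optimizer)

    # Manual alternative when calling minimize() directly.
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        return optimizer.minimize(total_loss)
```

Either branch should be called after the model has been built under the arg_scope, so that the batch norm update ops are already in the collection when the train op is created.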

【Comments】:
