【Posted】: 2018-03-16 08:21:07
【Question】:
I'm confused by tf.layers.batch_normalization in TensorFlow.
My code is as follows:
def my_net(x, num_classes, phase_train, scope):
    x = tf.layers.conv2d(...)
    x = tf.layers.batch_normalization(x, training=phase_train)
    x = tf.nn.relu(x)
    x = tf.layers.max_pooling2d(...)
    # some other stuff
    ...
    # return
    return x
def train():
    phase_train = tf.placeholder(tf.bool, name='phase_train')
    image_node = tf.placeholder(tf.float32, shape=[batch_size, HEIGHT, WIDTH, 3])
    images, labels = data_loader(train_set)
    val_images, val_labels = data_loader(validation_set)
    prediction_op = my_net(image_node, num_classes=2, phase_train=phase_train, scope='Branch1')
    loss_op = loss(...)
    # some other stuff
    optimizer = tf.train.AdamOptimizer(base_learning_rate)
    # run the batch-norm moving-average updates together with each training step
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_op = optimizer.minimize(loss=total_loss, global_step=global_step)
    sess = ...
    coord = ...
    while not coord.should_stop():
        image_batch, label_batch = sess.run([images, labels])
        _, loss_value = sess.run([train_op, loss_op], feed_dict={image_node: image_batch, label_node: label_batch, phase_train: True})
        step = step + 1
        if step == NUM_TRAIN_SAMPLES:
            for _ in range(NUM_VAL_SAMPLES // batch_size):
                image_batch, label_batch = sess.run([val_images, val_labels])
                prediction_batch = sess.run([prediction_op], feed_dict={image_node: image_batch, label_node: label_batch, phase_train: False})
            val_accuracy = compute_accuracy(...)
def test():
    phase_train = tf.placeholder(tf.bool, name='phase_train')
    image_node = tf.placeholder(tf.float32, shape=[batch_size, HEIGHT, WIDTH, 3])
    test_images, test_labels = data_loader(test_set)
    prediction_op = my_net(image_node, num_classes=2, phase_train=phase_train, scope='Branch1')
    # some stuff to load the trained weights into the graph
    saver.restore(...)
    for _ in range(NUM_TEST_SAMPLES // batch_size):
        image_batch, label_batch = sess.run([test_images, test_labels])
        prediction_batch = sess.run([prediction_op], feed_dict={image_node: image_batch, label_node: label_batch, phase_train: False})
    test_accuracy = compute_accuracy(...)
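Training seems to work well, and val_accuracy is reasonable (say 0.70). The problem is that when I test with the trained model (i.e. the test function), test_accuracy is very low (say 0.000270) if phase_train is set to False, but it looks correct (say 0.69) when phase_train is set to True.
As far as I understand, phase_train should be False during the test phase, right?
I'm not sure what the problem is. Am I misunderstanding batch normalization?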
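For readers following along, the gap between the two settings can be reproduced without TensorFlow. The sketch below is a simplified, single-feature model of what a batch normalization layer computes, not the library's implementation; the function name batch_norm, the sample values, and the assumption that the stored moving statistics are still at their initial values (mean 0, variance 1) are all illustrative.

```python
# Simplified, framework-free model of batch normalization on one feature.
# This is NOT TensorFlow's implementation; all names here are illustrative.
import math

EPSILON = 1e-3  # matches the default epsilon of tf.layers.batch_normalization


def batch_norm(batch, gamma, beta, moving_mean, moving_var, training):
    """Normalize a list of floats the way a BN layer does for one channel."""
    if training:
        # Training mode: statistics come from the current batch.
        mean = sum(batch) / len(batch)
        var = sum((v - mean) ** 2 for v in batch) / len(batch)
    else:
        # Inference mode: statistics come from the stored moving averages.
        mean, var = moving_mean, moving_var
    return [gamma * (v - mean) / math.sqrt(var + EPSILON) + beta for v in batch]


# Hypothetical activations that are far from zero mean and unit variance.
batch = [40.0, 45.0, 50.0, 55.0, 60.0]

# Assumption: the moving statistics are still at their initial values,
# as they would be if they were never updated or never restored.
stale_mean, stale_var = 0.0, 1.0

train_out = batch_norm(batch, 1.0, 0.0, stale_mean, stale_var, training=True)
infer_out = batch_norm(batch, 1.0, 0.0, stale_mean, stale_var, training=False)

print("training=True :", [round(v, 2) for v in train_out])
print("training=False:", [round(v, 2) for v in infer_out])
```

With training=True the outputs are centered on zero, while with training=False and stale statistics they stay close to the raw inputs. Later layers trained on normalized activations would then see inputs on a very different scale, which would be consistent with the symptom described above.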
【Discussion】:
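- Hi @Drop, thanks for your comment. Yes, I added the dependency on update_ops in the train function, but the error persists.
- Setting training=False is correct. The problem may not lie in batch normalization. Are you sure you loaded the model checkpoint correctly?
- Hi @KathyWu, thanks for your comment. Yes, I think the loading is correct, because I also tried a model without BN: it loaded correctly and the predictions were reasonable. The tf.layers.batch_normalization layer has two parameters, beta and gamma. When using BN, I also loaded scope/batch_normalization_1/beta:0 and scope/batch_normalization_1/gamma:0. The problem is that the test-phase predictions are reasonable when I set phase_train to True, but in general phase_train should be False.
- @mining After adding ... with tf.control_dependencies(update_ops): ..., phase_train = False works correctly in the test phase.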
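The role of the update ops mentioned in the comments can be sketched in plain Python. The snippet below models the exponential moving average that the ops in tf.GraphKeys.UPDATE_OPS maintain for a batch normalization layer; the helper names update_moving and simulate and the constant batch statistics are hypothetical, and the momentum of 0.99 is the layer's documented default.

```python
# Plain-Python model of the moving-average update behind the BN update ops.
# Helper names and the batch statistics below are hypothetical.
MOMENTUM = 0.99  # default momentum of tf.layers.batch_normalization


def update_moving(moving, batch_stat, momentum=MOMENTUM):
    """One exponential-moving-average step toward the batch statistic."""
    return momentum * moving + (1.0 - momentum) * batch_stat


def simulate(num_steps, run_update_ops):
    """Return (moving_mean, moving_var) after num_steps training steps."""
    moving_mean, moving_var = 0.0, 1.0  # initial values of the layer
    batch_mean, batch_var = 50.0, 50.0  # assumed constant batch statistics
    for _ in range(num_steps):
        if run_update_ops:
            # This only happens if the update ops run with the train step.
            moving_mean = update_moving(moving_mean, batch_mean)
            moving_var = update_moving(moving_var, batch_var)
    return moving_mean, moving_var


without_updates = simulate(1000, run_update_ops=False)
with_updates = simulate(1000, run_update_ops=True)

print("update ops skipped:", without_updates)
print("update ops run    :", with_updates)
```

In this model, skipping the updates leaves the statistics at their initial values no matter how long training runs. The same end state would result from a loader that restores only beta and gamma: the layer also keeps moving_mean and moving_variance, which are non-trainable variables and therefore do not appear in tf.trainable_variables().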
Tags: tensorflow batch-normalization