Tensorflow: global_step not incremented; hence exponentialDecay not working

【Question title】: Tensorflow: global_step not incremented; hence exponentialDecay not working
【Posted】: 2016-11-10 06:11:18
【Question】:

I'm trying to learn Tensorflow, and I wanted to take the framework of Tensorflow's cifar10 tutorial and train it on mnist (combining the two tutorials).

In the train method of cifar10.py:

def train(total_loss, global_step):  # cifar10.train
  lr = tf.train.exponential_decay(INITIAL_LEARNING_RATE,
                                  global_step,
                                  100,   # decay_steps
                                  0.1,   # decay_rate
                                  staircase=True)
  tf.scalar_summary('learning_rate', lr)
  tf.scalar_summary('global_step', global_step)

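global_step is initialized and passed in, and it does increase by 1 at every step, so the learning rate decays as expected. The source code can be found in Tensorflow's cifar10 tutorial.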

However, when I try to do the same thing in the train method of my modified mnist.py:

def training(loss, batch_size, global_step):  # mnist.training (modified)
  # Decay the learning rate exponentially based on the number of steps.
  lr = tf.train.exponential_decay(0.1,
                                  global_step,
                                  100,   # decay_steps
                                  0.1,   # decay_rate
                                  staircase=True)
  tf.scalar_summary('learning_rate1', lr)
  tf.scalar_summary('global_step1', global_step)

  # Create the gradient descent optimizer with the given learning rate.
  optimizer = tf.train.GradientDescentOptimizer(lr)
  # Create a variable to track the global step.
  global_step = tf.Variable(0, name='global_step', trainable=False)
  # Use the optimizer to apply the gradients that minimize the loss
  # (and also increment the global step counter) as a single training step.
  train_op = optimizer.minimize(loss, global_step=global_step)
  tf.scalar_summary('global_step2', global_step)
  tf.scalar_summary('learning_rate2', lr)
  return train_op

The global step is initialized (both in cifar10 and in my mnist file) as:

with tf.Graph().as_default(): 
  global_step = tf.Variable(0, trainable=False)
  ...
  # Build a Graph that trains the model with one batch of examples and           
  # updates the model parameters.                                                
  train_op = mnist10.training(loss, batch_size=100,                 
                              global_step=global_step)

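Here I record a scalar_summary for the global step and for the learning rate twice. learning_rate1 and learning_rate2 are identical and stay constant at 0.1 (the initial learning rate). global_step1 also stays constant at 0 over 2000 steps, while global_step2 increases linearly by 1 at every step.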
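For reference, with staircase=True the documented schedule is initial_lr * decay_rate ^ floor(global_step / decay_steps). Below is a minimal plain-Python sketch of that formula using the values from the question; the function name staircase_decay is made up for illustration and is not a TensorFlow API. It shows that a step counter stuck at 0 leaves the rate at its initial value, which would match the constant 0.1 seen in the summaries.

```python
def staircase_decay(initial_lr, global_step, decay_steps, decay_rate):
    """Plain-Python stand-in for the staircase schedule (hypothetical helper).

    Computes initial_lr * decay_rate ** floor(global_step / decay_steps).
    """
    return initial_lr * decay_rate ** (global_step // decay_steps)


# Same settings as in the question: initial rate 0.1, decay_steps 100,
# decay_rate 0.1.
for step in (0, 99, 100, 250, 2000):
    print(step, staircase_decay(0.1, step, 100, 0.1))

# If the counter the schedule reads never leaves 0, the exponent is always 0
# and the rate never changes, no matter how many training steps are run.
print(staircase_decay(0.1, 0, 100, 0.1))
```

With these settings the rate should drop by a factor of 10 every 100 steps, so over 2000 steps a working counter would make the decay clearly visible.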

The full code structure can be found at: https://bitbucket.org/jackywang529/tesorflow-sandbox/src

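I'm confused about why this happens with my global_step. I thought everything was set up symbolically, so once the program starts running the global step should be incremented no matter where I write the summary. I think this is why my learning rate is constant. Of course I may have made some simple mistake, and I'd be glad to get help or an explanation.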

[Image: global_step summaries written before and after the minimize function is called]

【Comments】:

The cifar10 tutorial is at github.com/tensorflow/tensorflow/blob/master/tensorflow/models/…. Sorry, I can't add a link because of my low reputation on the site.

Tags: python tensorflow

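【Solution 1】:

You are passing an argument called global_step into mnist.training, and you are also creating a variable called global_step inside mnist.training. The one exponential_decay tracks is the variable that was passed in, but the one that actually gets incremented (because it is passed to optimizer.minimize) is the newly created variable. Just remove the following statement from mnist.training and it should work: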

global_step = tf.Variable(0, name='global_step', trainable=False)
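To illustrate the mechanism without TensorFlow, here is a toy sketch in plain Python. Everything in it is hypothetical: Counter stands in for tf.Variable, make_schedule for exponential_decay, and train_op for what optimizer.minimize returns. It only models the name rebinding described above, not the real graph machinery.

```python
class Counter:
    """Hypothetical stand-in for a tf.Variable holding the step count."""

    def __init__(self, value=0):
        self.value = value


def make_schedule(step, initial_lr=0.1, decay_steps=100, decay_rate=0.1):
    """Return a function that reads `step` every time it is called."""
    def current_lr():
        return initial_lr * decay_rate ** (step.value // decay_steps)
    return current_lr


def buggy_training(global_step):
    lr = make_schedule(global_step)  # schedule holds the counter passed in
    global_step = Counter(0)         # the name now points at a NEW counter

    def train_op():
        global_step.value += 1       # only the new counter advances
    return train_op, lr


def fixed_training(global_step):
    lr = make_schedule(global_step)

    def train_op():
        global_step.value += 1       # the same counter the schedule reads
    return train_op, lr


def run(training, steps=250):
    """Run `steps` training steps; return (outer counter value, final rate)."""
    outer_step = Counter(0)
    train_op, lr = training(outer_step)
    for _ in range(steps):
        train_op()
    return outer_step.value, lr()


print("buggy:", run(buggy_training))  # outer counter never moves, rate stays put
print("fixed:", run(fixed_training))  # outer counter advances, rate decays
```

In the buggy variant the schedule and the training step look at two different objects that happen to share a name, which is the situation in the question; removing the rebinding makes both refer to the counter that was passed in.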

【Comments】:

Sorry, I wasn't paying attention, but thank you very much!!! I got confused by the framework.