Using metric.Mean() in TensorFlow

【Question title】: Using metric.Mean() in tensorflow
【Posted】: 2019-05-26 05:00:32
【Question description】:

I am working through the TensorFlow tutorial in Google Colab and ran everything exactly as specified at the following link:

https://www.tensorflow.org/tutorials/eager/custom_training_walkthrough

I am running the following code:

## Note: Rerunning this cell uses the same model variables

# keep results for plotting
train_loss_results = []
train_accuracy_results = []

num_epochs = 201

for epoch in range(num_epochs):
  epoch_loss_avg = tf.metrics.Mean()
  epoch_accuracy = tf.metrics.Accuracy()

  # Training loop - using batches of 32
  for x, y in train_dataset:
    # Optimize the model
    loss_value, grads = grad(model, x, y)
    optimizer.apply_gradients(zip(grads, model.variables),
                              global_step)

    # Track progress
    epoch_loss_avg(loss_value)  # add current batch loss
    # compare predicted label to actual label
    epoch_accuracy(tf.argmax(model(x), axis=1, output_type=tf.int32), y)

  # end epoch
  train_loss_results.append(epoch_loss_avg.result())
  train_accuracy_results.append(epoch_accuracy.result())

  if epoch % 50 == 0:
    print("Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(epoch,
                                                                epoch_loss_avg.result(),
                                                                epoch_accuracy.result()))

But when I run it, I get the following error:

AttributeError: module 'tensorflow._api.v1.metrics' has no attribute 'Mean'

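As I understand it, the code assigns the callable returned by tf.metrics.Mean() to epoch_loss_avg and then applies it in epoch_loss_avg(loss_value). I suspected that something in TensorFlow had changed since the tutorial was written, so I tried rewriting it as follows: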

## Note: Rerunning this cell uses the same model variables

# Keep results for plotting
train_loss_results = []
train_accuracy_result = []

num_epochs = 201

for epoch in range(num_epochs):
  #epoch_loss_avg = tf.metrics.Mean()
  #epoch_accuracy = tf.metrics.Accuracy()

  # Training loop - using batches of 32
  for x, y in train_dataset:
    # Optimize the model
    loss_value, grads = grad(model, x, y)
    optimizer.apply_gradients(zip(grads, model.variables),
                             global_step)

    # Track progress
    mean_temp = tf.metrics.mean(loss_value) # Add current batch loss
    # Compare the predicted label to actual label
    acc_temp = tf.metrics.accuracy(tf.argmax(model(x), axis = 1, output_type = tf.int32), y)

  # End epoch
  train_loss_results.append(mean_temp)
  train_accuracy_results.append(acc_temp)

  if epoch % 50 == 0:
    print("Epoch {:03d}: Loss: {:,3f}, Accuracy: {:.3f}".format(epoch,
                                                               epoch_loss_avg.result(),
                                                               epoch_accuracy.result()))

Here the functions are simply called directly, but now I get a different error message:

RuntimeError: tf.metrics.mean is not supported when eager execution is enabled.

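So my questions are: is there another way to write this that gives the same result, and is my interpretation of what is happening correct? If not, what is actually going on?

Thanks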

【Comments】:

It looks like this is an error in the tutorial -- github.com/googlecolab/colabtools/issues/…

Tags: python-3.x tensorflow google-colaboratory

【Solution 1】:

To use eager execution, change tf.metrics.Mean and tf.metrics.Accuracy to:

epoch_loss_avg = tf.contrib.eager.metrics.Mean()
epoch_accuracy = tf.contrib.eager.metrics.Accuracy()

And change tf.Variable to:

global_step = tf.contrib.eager.Variable(0)
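The tf.contrib namespace used above exists only in TensorFlow 1.x; it was removed in TensorFlow 2.x, where stateful Mean and Accuracy metrics are available under tf.keras.metrics. The sketch below is one possible way to pick whichever location the installed version provides. The helper name resolve_metric_classes and its lookup order are assumptions made for illustration, not part of TensorFlow or of the tutorial.

```python
def resolve_metric_classes(tf_module):
    """Return (Mean, Accuracy) metric classes usable under eager execution.

    Hypothetical helper: looks under tf.keras.metrics first (TF 2.x),
    then falls back to tf.contrib.eager.metrics (TF 1.x only).
    """
    candidates = ("keras.metrics", "contrib.eager.metrics")
    for dotted_path in candidates:
        namespace = tf_module
        # Walk the dotted path one attribute at a time, stopping if a part is missing.
        for part in dotted_path.split("."):
            namespace = getattr(namespace, part, None)
            if namespace is None:
                break
        if (namespace is not None
                and hasattr(namespace, "Mean")
                and hasattr(namespace, "Accuracy")):
            return namespace.Mean, namespace.Accuracy
    raise AttributeError("no eager-compatible Mean/Accuracy metrics found")


# Intended usage inside the training cell (assumes `import tensorflow as tf`):
#   Mean, Accuracy = resolve_metric_classes(tf)
#   epoch_loss_avg = Mean()
#   epoch_accuracy = Accuracy()
```

Passing the module in as an argument keeps the helper independent of any particular TensorFlow installation, so the rest of the training loop can stay unchanged.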

"As I understand it, the code assigns the callable returned by tf.metrics.Mean() to epoch_loss_avg and then applies it in epoch_loss_avg(loss_value)."

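Yes. The line epoch_loss_avg = tf.metrics.Mean() creates the object that computes the mean, and each call to epoch_loss_avg(loss_value) accumulates one batch's loss. At the end of the epoch, epoch_loss_avg.result() therefore returns the average loss over all batches in the dataset, which is the loss for that epoch.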
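The create, accumulate, read-out pattern described here can be illustrated without TensorFlow. The class below is a plain-Python stand-in written for this explanation (the name RunningMean and the sample loss values are invented); it is not how TensorFlow implements its metric, only a sketch of the same calling convention.

```python
class RunningMean:
    """Illustrative stand-in for a stateful mean metric (not TensorFlow code)."""

    def __init__(self):
        self.total = 0.0  # sum of all values seen so far
        self.count = 0    # number of values seen so far

    def __call__(self, value):
        # Each call folds one batch loss into the running totals.
        self.total += float(value)
        self.count += 1

    def result(self):
        # Mean over everything accumulated since construction.
        return self.total / self.count if self.count else 0.0


epoch_loss_avg = RunningMean()        # fresh accumulator at the start of an epoch
for batch_loss in [1.0, 0.5, 0.25]:   # made-up per-batch losses
    epoch_loss_avg(batch_loss)        # accumulate, as in the training loop
epoch_mean = epoch_loss_avg.result()  # average over the three batches
```

Because the accumulator is created inside the epoch loop, every epoch starts from zero. This is also why replacing it with a per-batch function call, as in the rewritten cell, would not give an epoch-level average even if the call succeeded.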

About the second error: as you saw, tf.metrics.mean raises a RuntimeError when eager execution is enabled. Use tf.contrib.eager.metrics instead.

【Comments】: