How to implement clip_gradients_by_norm in TensorFlow 2.0? Answers

【Question title】: How to implement clip_gradients_by_norm in TensorFlow 2.0?
【Posted】: 2019-06-03 13:44:34
【Question description】:

I would like to use tf.contrib.estimator.clip_gradients_by_norm in TF 2.0 the way I did under TF 1.3, but contrib is gone now, so I need a workaround, or even just some basic intuition about how it works.

I know this has already been raised as an issue on GitHub (https://github.com/tensorflow/tensorflow/issues/28707), but I am hoping to find a solution sooner if possible.

# Use gradient descent as the optimizer for training the model.
my_optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)

https://colab.research.google.com/notebooks/mlcc/first_steps_with_tensor_flow.ipynb?utm_source=mlcc&utm_campaign=colab-external&utm_medium=referral&utm_content=firststeps-colab&hl=en#scrollTo=ubhtW-NGU802

I have tried using a custom gradient as described here: https://www.tensorflow.org/guide/eager

@tf.custom_gradient
def clip_gradient_by_norm(x, norm):
  y = tf.identity(x)
  def grad_fn(dresult):
    return [tf.clip_by_norm(dresult, norm), None]
  return y, grad_fn

It did not work.

【Comments on the question】:

Tags: python tensorflow tensorflow2.0

【Solution 1】:

Following this comment on the issue (https://github.com/tensorflow/tensorflow/issues/28707#issuecomment-502336827), I found that you can change your code to look like this:

# Use gradient descent as the optimizer for training the model.
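import tensorflow as tf
from tensorflow.keras import optimizers
# clipnorm asks the Keras optimizer to clip gradients by norm before applying them.
my_optimizer = optimizers.SGD(learning_rate=0.0000001, clipnorm=5.0)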

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)

instead of:

# Use gradient descent as the optimizer for training the model.
my_optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Configure the linear regression model with our feature columns and optimizer.
# Set a learning rate of 0.0000001 for Gradient Descent.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
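For the intuition the question asks about, here is a minimal pure-Python sketch of the arithmetic behind clipping by norm. It is an illustration only, not TensorFlow's implementation: the helper name clip_by_norm and the sample gradient are made up for this example, and it assumes the documented behaviour of tf.clip_by_norm, where a tensor whose L2 norm exceeds clip_norm is rescaled by clip_norm / l2_norm and is otherwise left unchanged. As far as I understand, clipnorm=5.0 on the Keras optimizer applies this kind of rescaling to the gradients before the update step.

```python
import math


def clip_by_norm(values, clip_norm):
    """Rescale a flat list of floats so that its L2 norm is at most clip_norm.

    Hypothetical helper for illustration; it only mimics the formula
    documented for tf.clip_by_norm on a 1-D list of numbers.
    """
    l2_norm = math.sqrt(sum(v * v for v in values))
    if l2_norm <= clip_norm:
        # Small gradients pass through untouched.
        return list(values)
    # Large gradients keep their direction but are shrunk to length clip_norm.
    return [v * clip_norm / l2_norm for v in values]


def l2(values):
    """L2 norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


big_grad = [30.0, 40.0]    # L2 norm of 50, well above the threshold
small_grad = [0.3, 0.4]    # L2 norm of 0.5, below the threshold

clipped_big = clip_by_norm(big_grad, 5.0)
clipped_small = clip_by_norm(small_grad, 5.0)

print("norm before:", l2(big_grad), "norm after:", l2(clipped_big))
print("small gradient unchanged:", clipped_small == small_grad)
```

The point of the sketch is that clipping by norm caps the step size without changing the direction of the gradient, which is why it helps when a few very large gradients would otherwise make training diverge.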

【Comments】: