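【Posted】: 2016-01-21 21:05:01
【Problem description】:
I want to add a max-norm constraint to several of the weight matrices in my TensorFlow graph, like Torch's renorm method.
If the L2 norm of any neuron's weight matrix exceeds max_norm, I want to scale its weights down so that their L2 norm is exactly max_norm.
What is the best way to express this using TensorFlow?
【Question discussion】:
Tags: python tensorflow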
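To make the rule concrete, here is a minimal framework-free sketch of it for a single weight vector. It is plain Python rather than TensorFlow, and the function name `clip_to_max_norm` is made up for illustration.

```python
import math

def clip_to_max_norm(weights, max_norm):
    """Return weights rescaled so that their L2 norm is at most max_norm."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= max_norm:
        return list(weights)  # already within the constraint, left unchanged
    scale = max_norm / norm   # shrink factor that brings the norm to max_norm
    return [w * scale for w in weights]

small = clip_to_max_norm([3.0, 4.0], max_norm=10.0)    # norm 5: unchanged
large = clip_to_max_norm([30.0, 40.0], max_norm=10.0)  # norm 50: rescaled
```

Vectors whose norm is already below max_norm are left alone; only the ones that exceed it are rescaled.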
Here is one possible implementation:

import tensorflow as tf

def max_norm_regularizer(threshold, axes=1, name="max_norm", collection="max_norm"):
    def max_norm(weights):
        # Clip each slice along `axes` to an L2 norm of at most `threshold`.
        clipped = tf.clip_by_norm(weights, clip_norm=threshold, axes=axes)
        # Op that writes the clipped values back into the weights variable.
        clip_weights = tf.assign(weights, clipped, name=name)
        tf.add_to_collection(collection, clip_weights)
        return None  # there is no regularization loss term
    return max_norm
Here is how you would use it:

from tensorflow.contrib.layers import fully_connected
from tensorflow.contrib.framework import arg_scope

with arg_scope(
        [fully_connected],
        weights_regularizer=max_norm_regularizer(1.5)):
    hidden1 = fully_connected(X, 200, scope="hidden1")
    hidden2 = fully_connected(hidden1, 100, scope="hidden2")
    outputs = fully_connected(hidden2, 5, activation_fn=None, scope="outs")

max_norm_ops = tf.get_collection("max_norm")

[...]

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for X_batch, y_batch in load_next_batch():
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
            # Re-apply the max-norm constraint after every training step.
            sess.run(max_norm_ops)
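This creates a 3-layer neural network and trains it with max-norm regularization on every layer (with a threshold of 1.5). I just tried it and it seems to work. Hope this helps! Suggestions for improvement are welcome. :)
Notes
This code is based on tf.clip_by_norm():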
>>> x = tf.constant([0., 0., 3., 4., 30., 40., 300., 400.], shape=(4, 2))
>>> print(x.eval())
[[ 0. 0.]
[ 3. 4.]
[ 30. 40.]
[ 300. 400.]]
>>> clip_rows = tf.clip_by_norm(x, clip_norm=10, axes=1)
>>> print(clip_rows.eval())
[[ 0. 0. ]
[ 3. 4. ]
[ 6. 8. ] # clipped!
[ 6.00000048 8. ]] # clipped!
You can also clip columns if needed:
>>> clip_cols = tf.clip_by_norm(x, clip_norm=350, axes=0)
>>> print(clip_cols.eval())
[[ 0. 0. ]
[ 3. 3.48245788]
[ 30. 34.82457733]
[ 300. 348.24578857]]
# second column clipped!
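The numbers above can be cross-checked without TensorFlow. The following is a plain-Python sketch of per-row and per-column clipping. The helper `clip_by_norm_2d` and its `axis` parameter are hypothetical names for this illustration, not part of the TensorFlow API.

```python
import math

def clip_by_norm_2d(matrix, clip_norm, axis):
    """Clip a list-of-rows matrix: axis=1 clips each row, axis=0 each column."""
    if axis == 0:
        # Transpose so that columns can be handled as rows, then transpose back.
        transposed = [list(col) for col in zip(*matrix)]
        clipped = clip_by_norm_2d(transposed, clip_norm, axis=1)
        return [list(row) for row in zip(*clipped)]
    result = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        # Only slices whose norm exceeds clip_norm are scaled down.
        scale = clip_norm / norm if norm > clip_norm else 1.0
        result.append([v * scale for v in row])
    return result

x = [[0.0, 0.0], [3.0, 4.0], [30.0, 40.0], [300.0, 400.0]]
rows = clip_by_norm_2d(x, clip_norm=10, axis=1)   # row norms: 0, 5, 50, 500
cols = clip_by_norm_2d(x, clip_norm=350, axis=0)  # column norms: ~302, ~402
```

With these inputs, the last two rows are rescaled to norm 10 in the first call, and only the second column is rescaled in the second call, which agrees with the output shown above up to float32 rounding.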
【Discussion】:
Using Rafał's suggestion and TensorFlow's implementation of clip_by_norm, this is what I came up with:
def renorm(x, axis, max_norm):
    '''Renormalizes the sub-tensors along axis such that they do not exceed norm max_norm.'''
    # This elaborate dance avoids empty slices, which TF dislikes.
    rank = tf.rank(x)
    bigrange = tf.range(-1, rank + 1)
    dims = tf.slice(
        tf.concat(0, [tf.slice(bigrange, [0], [1 + axis]),
                      tf.slice(bigrange, [axis + 2], [-1])]),
        [1], rank - [1])

    # Determine which columns need to be renormalized.
    l2norm_inv = tf.rsqrt(tf.reduce_sum(x * x, dims, keep_dims=True))
    scale = max_norm * tf.minimum(l2norm_inv, tf.constant(1.0 / max_norm))

    # Broadcast the scalings
    return tf.mul(scale, x)
It seems to have the desired behavior for 2-D matrices and should generalize to tensors:
> x = tf.constant([0., 0., 3., 4., 30., 40., 300., 400.], shape=(4, 2))
> print x.eval()
[[ 0. 0.] # rows have norms of 0, 5, 50, 500
[ 3. 4.] # cols have norms of ~302, ~402
[ 30. 40.]
[ 300. 400.]]
> print renorm(x, 0, 10).eval()
[[ 0. 0. ] # unaffected
[ 3. 4. ] # unaffected
[ 5.99999952 7.99999952] # rescaled
[ 6.00000048 8.00000095]] # rescaled
> print renorm(x, 1, 350).eval()
[[ 0. 0. ] # col 0 is unaffected
[ 3. 3.48245788] # col 1 is rescaled
[ 30. 34.82457733]
[ 300. 348.24578857]]
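The scale factor in renorm, max_norm * min(1 / norm, 1 / max_norm), equals 1 whenever the norm is within the limit and max_norm / norm otherwise. The plain-Python sketch below checks this for the four row norms in the example. The helper name `renorm_scale` is hypothetical, and the handling of a zero norm assumes that the reciprocal square root of zero evaluates to infinity, as the unaffected all-zero row in the output suggests.

```python
import math

def renorm_scale(norm, max_norm):
    """Scale factor applied to a slice with the given L2 norm."""
    # Assumption: a zero norm gives an infinite reciprocal, so min() picks 1/max_norm.
    inv_norm = math.inf if norm == 0.0 else 1.0 / norm
    return max_norm * min(inv_norm, 1.0 / max_norm)

# Row norms from the example: 0, 5, 50, 500, with max_norm = 10.
scales = [renorm_scale(n, max_norm=10.0) for n in (0.0, 5.0, 50.0, 500.0)]
```

Taking the minimum of the two reciprocals avoids an explicit branch and keeps the scale finite for an all-zero slice.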
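【Discussion】:
Take a look at the clip_by_norm function, which does exactly this. It takes a single tensor as input and returns a scaled-down tensor.
【Discussion】: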
Replace the reduce_sum call in clip_by_norm with something that sets reduction_indices?
reduction_axis = 1 # or 0 for rows
reduce_sum(t * t, reduction_axis, keep_dims=True)
A list like [0, 1, 3, 4]: a range with one element missing.
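When the rank is known as an ordinary Python integer, a list of this kind (every dimension except one) can be built directly. This is only a sketch under that assumption, and the helper name `reduction_dims` is made up for illustration.

```python
def reduction_dims(rank, axis):
    """All dimension indices of a rank-`rank` tensor except `axis`."""
    return [d for d in range(rank) if d != axis]

dims = reduction_dims(rank=5, axis=2)       # skips dimension 2
matrix_dims = reduction_dims(rank=2, axis=0)  # for a matrix, only dimension 1 is left
```

The slicing in renorm above builds the same kind of list from tensor ops, which is needed when the rank is only available as a tensor.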