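[Posted]: 2017-12-30 07:00:40
[Question]:
I want to build a Caffe-style L2-norm layer. (Well, what I actually want is to use TensorFlow inside a pycaffe layer, because writing .cu files with CUDA for Caffe is a heavy task.)
Forward pass:
- Input (x): an n-dimensional array
- Output (y): an n-dimensional array with the same shape as the input
- Operation: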
y = x / sqrt(sum(x^2,axis=(0,1))) # channel wise L2 normalization
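For reference, here is a minimal NumPy sketch of the same forward operation, so the expected shapes are explicit. The function name `l2_normalize` and the sample input are my own, not part of Caffe or TensorFlow, and the `eps` term is an assumption that mirrors the class below.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Divide x by its L2 norm taken over axes 0 and 1 (hypothetical helper)."""
    # keepdims=True keeps the reduced axes so the norm broadcasts back over x.
    norm = np.sqrt(np.sum(x ** 2, axis=(0, 1), keepdims=True) + eps)
    return x / norm


# Sample input of shape (2, 3, 2); the values are arbitrary.
x = np.arange(1.0, 13.0).reshape(2, 3, 2)
y = l2_normalize(x)
```

After normalization, the sum of squares of `y` over axes 0 and 1 should be very close to 1 for every index along the remaining axis.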
import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.div)


class L2NormLayer:
    def __init__(self):
        self.eps = 1e-12
        self.sess = tf.Session()

    def forward(self, in_x):
        self.x = tf.constant(in_x)
        self.xp2 = tf.pow(self.x, 2)
        self.sum_xp2 = tf.reduce_sum(self.xp2, axis=(0, 1))
        self.sqrt_sum_xp2 = tf.sqrt(self.sum_xp2 + self.eps)
        self.hat = tf.div(self.x, self.sqrt_sum_xp2)
        return self.sess.run(self.hat)

    def backward(self, dl):
        # 'dl' is the gradient of the loss w.r.t. this layer's output,
        # passed down from the layer above (chain rule).
        # How do I calculate this gradient automatically using TensorFlow?
        # Hand-crafted backward version:
        loss = tf.constant(dl)
        d_x1 = tf.div(loss, self.sqrt_sum_xp2)
        d_sqrt_sum_xp2 = tf.div(-tf.reduce_sum(self.x * loss, axis=(0, 1)),
                                (self.eps + tf.pow(self.sqrt_sum_xp2, 2)))
        d_sum_xp2 = tf.div(d_sqrt_sum_xp2, (self.eps + 2 * tf.sqrt(self.sum_xp2)))
        d_xp2 = tf.ones_like(self.xp2) * d_sum_xp2
        d_x2 = 2 * self.x * d_xp2
        d_x = d_x1 + d_x2
        return self.sess.run(d_x)
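One way to sanity-check a hand-written backward pass like the one above is to compare it against central finite differences. The sketch below does this in plain NumPy. All function names are hypothetical, and the closed form `dl / s - x * sum(x * dl) / s**3` is the same chain rule collapsed into one expression, so it matches the step-by-step version only up to the small `eps` terms.

```python
import numpy as np


def l2norm_forward(x, eps=1e-12):
    s = np.sqrt(np.sum(x ** 2, axis=(0, 1), keepdims=True) + eps)
    return x / s


def l2norm_backward(x, dl, eps=1e-12):
    """Analytic gradient of sum(y * dl) w.r.t. x, where y = x / s."""
    s = np.sqrt(np.sum(x ** 2, axis=(0, 1), keepdims=True) + eps)
    # Direct term dl / s plus the term that flows through the norm s.
    return dl / s - x * np.sum(x * dl, axis=(0, 1), keepdims=True) / s ** 3


def numeric_grad(x, dl, h=1e-6):
    """Central finite differences of sum(forward(x) * dl), one entry at a time."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        xp, xm = x.copy(), x.copy()
        xp[idx] += h
        xm[idx] -= h
        grad[idx] = np.sum((l2norm_forward(xp) - l2norm_forward(xm)) * dl) / (2 * h)
    return grad


# Arbitrary test data; the shape (2, 3, 4) is an assumption for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4))
dl = rng.normal(size=(2, 3, 4))
max_err = np.max(np.abs(l2norm_backward(x, dl) - numeric_grad(x, dl)))
```

A small `max_err` suggests that the analytic formula and the forward pass agree. It does not prove the TensorFlow graph is wired the same way.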
As noted in the code, how can I have TensorFlow compute the gradient of the forward function automatically?
[Discussion]:
Tags: tensorflow neural-network deep-learning caffe gradient-descent
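One possible approach, sketched here under assumptions, is to let `tf.gradients` build the backward graph and pass the upstream gradient through its `grad_ys` argument. The class name `L2NormLayerAuto`, the use of placeholders instead of `tf.constant`, and the `tf.compat.v1` alias (which assumes TensorFlow 1.13 or later) are my own choices rather than anything pycaffe requires.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TensorFlow >= 1.13, where this alias exists


class L2NormLayerAuto:
    """Hypothetical variant of the layer that uses automatic differentiation."""

    def __init__(self, shape, eps=1e-12):
        self.graph = tf.Graph()
        with self.graph.as_default():
            # Placeholders let one graph be reused across forward/backward calls.
            self.x = tf1.placeholder(tf.float64, shape=shape)
            self.dl = tf1.placeholder(tf.float64, shape=shape)
            norm = tf.sqrt(tf.reduce_sum(tf.square(self.x), axis=(0, 1)) + eps)
            self.y = self.x / norm
            # grad_ys weights the output by the upstream gradient, so the
            # result is d(loss)/dx rather than the gradient of sum(y).
            self.dx = tf1.gradients([self.y], [self.x], grad_ys=[self.dl])[0]
        self.sess = tf1.Session(graph=self.graph)

    def forward(self, in_x):
        self.in_x = in_x  # remember the input for the backward pass
        return self.sess.run(self.y, {self.x: in_x})

    def backward(self, dl):
        return self.sess.run(self.dx, {self.x: self.in_x, self.dl: dl})


# Usage with arbitrary data; the shape (2, 3, 4) is only for illustration.
rng = np.random.RandomState(0)
x_val = rng.randn(2, 3, 4)
dl_val = rng.randn(2, 3, 4)
layer = L2NormLayerAuto(x_val.shape)
y_val = layer.forward(x_val)
dx_val = layer.backward(dl_val)
```

Because the output is normalized over axes 0 and 1, the returned gradient should be orthogonal to the input over those axes, which gives a quick check. In eager-mode TensorFlow 2.x, `tf.GradientTape.gradient` takes an `output_gradients` argument that plays a similar role.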