Propagating through a custom layer in TensorFlow just once

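【Question title】: Propagating through a custom layer in tensorflow just once
【Posted】: 2019-08-25 15:37:41
【Question】:

Given a custom layer in tensorflow, is it possible to have the model use it during only one epoch? For all other epochs the layer could be ignored, or simply act as an identity.

For example: given some data, I want the layer to simply double it. The other layers should work as usual. How can this be done?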

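import tensorflow as tf

def do_stuff(data):
    # Runs as plain Python/NumPy code: doubles the incoming array.
    return 2 * data

def run_once(data):
    # tf.py_func is the TF 1.x API; op names may not contain spaces.
    return tf.py_func(do_stuff,
                      [data],
                      tf.float32,
                      stateful=False,
                      name='run_once')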


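from tensorflow.keras.layers import Layer

class CustomLayer(Layer):
    def __init__(self, output_dim, **kwargs):
        # Initialize the base Layer before setting attributes on it.
        super(CustomLayer, self).__init__(**kwargs)
        self.output_dim = output_dim
        self.trainable = False

    def call(self, x):
        # Apply run_once to each sample in the batch.
        res = tf.map_fn(run_once, x)
        # py_func drops static shape information, so restore it here.
        res.set_shape([x.shape[0],
                       self.output_dim[1],
                       self.output_dim[0],
                       x.shape[-1]])
        return res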

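from tensorflow.keras.layers import Input, Lambda, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(224, 224, 1))
# preprocess_input is not defined in the post; Lambda expects a callable,
# and the resulting layer has to be applied to `inputs`.
x = Lambda(preprocess_input, output_shape=(224, 224, 3))(inputs)
outputs = Dense(1)(x)
model = Model(inputs=inputs, outputs=outputs)
output = model(inputs)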

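【Question comments】:

Tags: tensorflow keras neural-network layer

【Solution 1】:

Interesting question. To execute a TF operation only in the first epoch, you can use tf.cond and tf.control_dependencies to check and update the value of a boolean tensor. For example, your custom layer could be implemented as follows: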

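import tensorflow as tf
from tensorflow.keras.layers import Layer

class CustomLayer(Layer):
    def __init__(self, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        # Non-trainable boolean flag: True until the layer has run once.
        self.first_epoch = tf.Variable(True, trainable=False)
        super(CustomLayer, self).build(input_shape)

    def call(self, x):
        # Take the run_once branch only while the flag is still True.
        res = tf.cond(self.first_epoch,
                      true_fn=lambda: run_once(x),
                      false_fn=lambda: x)
        # Flip the flag only after res has been computed, and tie the
        # returned tensor to the assignment so that it actually executes.
        with tf.control_dependencies([res]):
            assign_op = self.first_epoch.assign(False)
            with tf.control_dependencies([assign_op]):
                res = tf.identity(res)
        return res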

To verify that this layer works as expected, define run_once as:

def run_once(data):
    print_op = tf.print('First epoch')
    with tf.control_dependencies([print_op]):
        out = tf.identity(data)
    return out

【Discussion】:

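This looks interesting! One question, though: what happens if this layer is used during prediction? For training, it is simple to have each batch go through your "run once" path. The layer could just count the amount of data or the number of batches received so far and stop after a fixed number of steps (which is known in advance). Saving the model and letting someone else use it for prediction might be another matter :/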
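The counting idea from the comment above can be sketched without TensorFlow. This is only an illustration of the control flow, under the assumption that batches are plain Python lists; `CountingGate`, `max_steps`, and `double` are hypothetical names that do not appear in the thread. In a real layer the counter would have to live in a variable so that it persists across graph executions.

```python
class CountingGate:
    """Applies `fn` for the first `max_steps` calls, then acts as the identity."""

    def __init__(self, fn, max_steps):
        self.fn = fn                # transformation used while the gate is open
        self.max_steps = max_steps  # number of batches that take the "run once" path
        self.steps_seen = 0         # batches transformed so far

    def __call__(self, batch):
        if self.steps_seen < self.max_steps:
            self.steps_seen += 1
            return self.fn(batch)
        # Gate is closed: pass the batch through unchanged.
        return batch


def double(batch):
    # Stand-in for do_stuff from the question, on a plain list.
    return [2 * v for v in batch]


gate = CountingGate(double, max_steps=2)
outputs = [gate([1, 2]) for _ in range(4)]
print(outputs)
```

With `max_steps` set to the number of batches in one epoch, the first epoch would be transformed and every later batch would pass through untouched.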
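If this layer is used during prediction, the behavior depends on the state of first_epoch (in fact, first_batch might be a better name). If the network has already been trained, the state will be False and run_once will not execute at prediction time, and vice versa. I am not sure what the expected behavior at prediction time is, but it can easily be changed by incorporating the learning phase into the logic. As for saving the model, there should be no problem, because the first_epoch flag is a variable that can be restored.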
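One way to read "incorporating the learning phase into the logic" is to transform a batch only when the layer is training and the flag is still set. The framework-free sketch below shows that decision logic and how the flag survives a save and restore. It assumes plain Python lists as batches; `FirstBatchGate`, `get_state`, and `set_state` are hypothetical names, not TensorFlow or Keras APIs.

```python
class FirstBatchGate:
    """Transforms only the first batch seen in training mode."""

    def __init__(self, fn):
        self.fn = fn
        self.first_batch = True  # plays the role of the first_epoch variable

    def __call__(self, batch, training):
        if training and self.first_batch:
            # Clear the flag only on the training path.
            self.first_batch = False
            return self.fn(batch)
        # Prediction, or any later training batch: identity.
        return batch

    def get_state(self):
        return {"first_batch": self.first_batch}

    def set_state(self, state):
        self.first_batch = state["first_batch"]


def double(batch):
    return [2 * v for v in batch]


gate = FirstBatchGate(double)
pred_before = gate([1, 2], training=False)   # identity, flag untouched
train_first = gate([1, 2], training=True)    # transformed, flag cleared
train_second = gate([1, 2], training=True)   # identity

# Simulate saving the trained model and loading it elsewhere.
restored = FirstBatchGate(double)
restored.set_state(gate.get_state())
after_restore = restored([1, 2], training=True)  # identity: flag was restored as False
```

With this rule, prediction never triggers the transformation and never consumes the flag, whether or not the model has been trained.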