Implementing a complicated activation function in Keras: answers

【Question title】: Implementing a complicated activation function in Keras
【Posted】: 2018-07-25 08:09:08
【Question description】:

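I just read an interesting paper: A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks.

I would like to try implementing this activation function in Keras. I have implemented custom activations before, for example a sinusoidal activation: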

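from keras import backend as K
from keras.layers import Activation
from keras.utils.generic_utils import get_custom_objects
def sin(x):
  return K.sin(x)
get_custom_objects().update({'sin': Activation(sin)})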

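However, the activation function in this paper has 3 unique properties:

It doubles the size of the input (the output is 2x the input)
It is parameterized
Its parameters should be regularized

I think that once I have a skeleton for handling these 3 issues I can work out the math myself, but I'll take any help I can get!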

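【Question comments】:

Can you detail what your activation is supposed to do? (Does it have trainable weights?)
@DanielMöller Yes, it has trainable weights. It is an activation that varies between linear, logarithmic, exponential, and sinusoidal outputs, depending on the values of the trainable weights.
Then you will have to create a custom class... keras.io/layers/writing-your-own-keras-layers
@DanielMöller I actually found someone trying to implement the soft exponential activation as a layer here: github.com/fchollet/keras/issues/3842. Any idea where they went wrong?

Tags: keras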

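【Solution 1】:

Here we will need one of these two:

A Lambda layer - if your parameters are not trainable (you don't want them to change with backpropagation)
A custom layer - if you need custom trainable parameters.

Lambda layer:

If your parameters are not trainable, you can define your function for a lambda layer. The function takes one input tensor and can return anything you want: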

import keras.backend as K

def customFunction(x):

    #x can be either a single tensor or a list of tensors
    #if a list, use the elements x[0], x[1], etc.

    #Perform your calculations here using the keras backend
    #If you could share which formula exactly you're trying to implement, 
        #it's possible to make this answer better and more to the point    

    #dummy example
    alphaReal = K.variable([someValue])    
    alphaImag = K.variable([anotherValue]) #or even an array of values   

    realPart = alphaReal * K.someFunction(x) + ... 
    imagPart = alphaImag * K.someFunction(x) + ....

    #You can return them as two outputs in a list (requires the functional API model)
    #Or you can find backend functions that join them together, such as K.stack

    return [realPart,imagPart]

    #I think the separate approach will give you a better control of what to do next.

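To see what you can do, explore the backend functions.

For the parameters, you can define them as keras constants or variables (K.constant or K.variable), either inside or outside the function above, or even turn them into model inputs. See details in this answer.

In your model, you just add a lambda layer that uses that function.

In a Sequential model: model.add(Lambda(customFunction, output_shape=someShape))
In a functional API model: output = Lambda(customFunction, ...)(inputOrListOfInputs)

If you're going to pass more than one input to the function, you'll need the functional model API.
If you're using Tensorflow, the output_shape is computed automatically; I believe only Theano requires it. (Not sure about CNTK.)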
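
As a concrete illustration of the two ways of adding the layer, here is a minimal sketch with a fixed, non-trainable parameter. The names ALPHA and scaled_sin are hypothetical, the math is only a stand-in for the real formula, and the code assumes tensorflow.keras rather than the standalone keras package used in the answer.

```python
import tensorflow as tf
from tensorflow import keras

ALPHA = 0.5  # hypothetical fixed parameter; backpropagation never changes it


def scaled_sin(x):
    # Stand-in math (assumption): any differentiable tensor expression works here.
    return ALPHA * tf.sin(x)


# Sequential model: the Lambda layer is added like any other layer.
seq_model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8),
    keras.layers.Lambda(scaled_sin),
])

# Functional API model: the Lambda layer is called on a tensor.
inputs = keras.Input(shape=(4,))
hidden = keras.layers.Dense(8)(inputs)
outputs = keras.layers.Lambda(scaled_sin)(hidden)
func_model = keras.Model(inputs, outputs)
```

Because ALPHA is a plain Python float, it is baked into the function and does not show up among the model's weights.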

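Custom layer:

A custom layer is a new class you create. This approach is only necessary if you're going to have trainable parameters in your function (for example, optimizing alpha with backpropagation).

Keras teaches it here.

Basically, you have an __init__ method where you pass the constant parameters, a build method where you create the trainable parameters (weights), a call method that does the calculations (exactly what would go in the lambda layer if you had no trainable parameters), and a compute_output_shape method so you can tell the model what the output shape is.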

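from keras.layers import Layer

class CustomLayer(Layer):

    def __init__(self, alphaReal, alphaImag, **kwargs):

        #the base class must be initialized before attributes are set
        super(CustomLayer, self).__init__(**kwargs)
        self.alphaReal = alphaReal
        self.alphaImag = alphaImag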

    def build(self,input_shape):

        #weights may or may not depend on the input shape
        #you may use it or not...

        #suppose we want just two trainable values:
        weightShape = (2,)

        #create the weights:
        self.kernel = self.add_weight(name='kernel', 
                                  shape=weightShape,
                                  initializer='uniform',
                                  trainable=True)

        super(CustomLayer, self).build(input_shape)  # Be sure to call this somewhere!

    def call(self,x):

        #all the calculations go here:

        #dummy example using the constant inputs
        realPart = self.alphaReal * K.someFunction(x) + ... 
        imagPart = self.alphaImag * K.someFunction(x) + ....

        #dummy example taking elements of the trainable weights
        realPart = self.kernel[0] * realPart    
        imagPart = self.kernel[1] * imagPart

        #all the comments for the lambda layer above are valid here

        #example returning a list
        return [realPart,imagPart]

    def compute_output_shape(self,input_shape):

        #if you decide to return a list of tensors in the call method, 
        #return a list of shapes here, twice the input shape:
        return [input_shape,input_shape]    

        #if you instead stacked your results in a single tensor, return a single
        #tuple, maybe with an additional dimension equal to 2:
        #return input_shape + (2,)
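
To make the skeleton concrete, here is a minimal runnable sketch that touches the three properties from the question: trainable parameters, a regularizer on them, and an output twice as wide as the input. The class name DoubledActivation, its arguments init_value and alpha_l2, and the cos/sin math are placeholders assumed for illustration; they are not the formula from the paper. The sketch assumes tensorflow.keras rather than the standalone keras package.

```python
import tensorflow as tf
from tensorflow import keras


class DoubledActivation(keras.layers.Layer):
    """Hypothetical layer: two trainable, regularized weights and a 2x-wide output."""

    def __init__(self, init_value=0.5, alpha_l2=1e-3, **kwargs):
        super().__init__(**kwargs)
        self.init_value = init_value
        self.alpha_l2 = alpha_l2

    def build(self, input_shape):
        # Trainable parameters. The L2 penalty is collected in layer.losses
        # and added to the training loss by Keras.
        self.kernel = self.add_weight(
            name="kernel",
            shape=(2,),
            initializer=keras.initializers.Constant(self.init_value),
            regularizer=keras.regularizers.l2(self.alpha_l2),
            trainable=True,
        )
        super().build(input_shape)

    def call(self, x):
        # Placeholder math (assumption): swap in the real formula here.
        real_part = self.kernel[0] * tf.cos(x)
        imag_part = self.kernel[1] * tf.sin(x)
        # Concatenating on the last axis doubles the feature dimension.
        return tf.concat([real_part, imag_part], axis=-1)

    def compute_output_shape(self, input_shape):
        shape = list(input_shape)
        if shape[-1] is not None:
            shape[-1] = 2 * shape[-1]
        return tuple(shape)


inputs = keras.Input(shape=(4,))
hidden = keras.layers.Dense(8)(inputs)
outputs = DoubledActivation()(hidden)
model = keras.Model(inputs, outputs)  # last dimension goes from 8 to 16
```

Returning one concatenated tensor instead of a list keeps the layer usable in a Sequential model; returning a list, as in the skeleton, requires the functional API.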

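【Comments】:

【Solution 2】:

You need to implement a "layer", not an ordinary activation function.

I think the implementation of pReLU in Keras would be a very good example for your task. See pReLU.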
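
For orientation, a PReLU-style layer can be sketched in a few lines. This is a simplified illustration, not the Keras source: the class name SimplePReLU and the initial slope of 0.25 are assumptions, and the code uses tensorflow.keras.

```python
import tensorflow as tf
from tensorflow import keras


class SimplePReLU(keras.layers.Layer):
    """Simplified PReLU-style layer: f(x) = max(0, x) + alpha * min(0, x)."""

    def build(self, input_shape):
        # One trainable negative-side slope per feature on the last axis.
        self.alpha = self.add_weight(
            name="alpha",
            shape=(input_shape[-1],),
            initializer=keras.initializers.Constant(0.25),  # assumed start value
            trainable=True,
        )
        super().build(input_shape)

    def call(self, x):
        # Positive inputs pass through; negative inputs are scaled by alpha.
        return tf.nn.relu(x) + self.alpha * tf.minimum(x, 0.0)
```

The pattern is the same one the question needs: the weight is created in build, used in call, and updated by backpropagation like any other layer weight.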

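【Comments】:

【Solution 3】:

A lambda function in the activation worked for me. It may not be what you want, but it is one step more complicated than simply using a built-in activation function.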

encoder_outputs = Dense(units=latent_vector_len, activation=k.layers.Lambda(lambda z: k.backend.round(k.layers.activations.sigmoid(x=z))), kernel_initializer="lecun_normal")(x)

This code changes the output of Dense from real values to 0 or 1 (i.e. binary).

Keras issues a warning, but the code still works.
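
An alternative worth trying is to pass a plain Python function as the activation instead of wrapping it in a Lambda layer. The sketch below assumes tensorflow.keras; binary_sigmoid and the value of latent_vector_len are hypothetical, and whether this removes the warning has not been verified here.

```python
import tensorflow as tf
from tensorflow import keras


def binary_sigmoid(z):
    # Sigmoid maps to (0, 1); rounding then snaps each value to 0 or 1.
    return tf.round(tf.sigmoid(z))


latent_vector_len = 4  # hypothetical size, not taken from the original answer

dense = keras.layers.Dense(
    units=latent_vector_len,
    activation=binary_sigmoid,
    kernel_initializer="lecun_normal",
)
```

Note that rounding is flat almost everywhere, so its gradient is zero wherever it is defined; weights that sit before this activation are unlikely to receive a useful training signal through it.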

【Comments】: