【Question title】:Keras Custom Layer ValueError: An operation has `None` for gradient.
【Posted】:2018-12-20 15:06:33
【Question】:

I created a custom Keras Conv2D layer as follows:

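# Imports assumed for standalone Keras 2.x (not shown in the original post)
from keras import activations
from keras import backend as K
from keras.initializers import RandomUniform
from keras.layers import Conv2D, InputSpec
from keras.utils import conv_utils

class CustConv2D(Conv2D):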

    def __init__(self, filters, kernel_size, kernelB=None, activation=None, **kwargs): 
        self.rank = 2
        self.num_filters = filters
        self.kernel_size = conv_utils.normalize_tuple(kernel_size, self.rank, 'kernel_size')
        self.kernelB = kernelB
        self.activation = activations.get(activation)

        super(CustConv2D, self).__init__(self.num_filters, self.kernel_size, **kwargs)

    def build(self, input_shape):
        if K.image_data_format() == 'channels_first':
            channel_axis = 1
        else:
            channel_axis = -1
        if input_shape[channel_axis] is None:
            raise ValueError('The channel dimension of the inputs '
                     'should be defined. Found `None`.')

        input_dim = input_shape[channel_axis]
        num_basis = K.int_shape(self.kernelB)[-1]

        kernel_shape = (num_basis, input_dim, self.num_filters)

        self.kernelA = self.add_weight(shape=kernel_shape,
                                      initializer=RandomUniform(minval=-1.0, 
                                      maxval=1.0, seed=None),
                                      name='kernelA',
                                      regularizer=self.kernel_regularizer,
                                      constraint=self.kernel_constraint)

        self.kernel = K.sum(self.kernelA[None, None, :, :, :] * self.kernelB[:, :, :, None, None], axis=2)

        # Set input spec.
        self.input_spec = InputSpec(ndim=self.rank + 2, axes={channel_axis: input_dim})
        self.built = True
        super(CustConv2D, self).build(input_shape)
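
For reference, the `self.kernel` expression in `build` combines the fixed basis `kernelB` with the trainable coefficients `kernelA` by contracting over the basis axis. The NumPy sketch below reproduces only that arithmetic outside Keras. The 11x11 kernel, single input channel and 64 filters match the usage in the post; `num_basis = 5` and the random values are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# 11x11 kernel, 1 input channel, 64 filters as in the post; num_basis is assumed.
h, w, num_basis, input_dim, num_filters = 11, 11, 5, 1, 64

kernelB = rng.standard_normal((h, w, num_basis))                        # fixed basis
kernelA = rng.uniform(-1.0, 1.0, (num_basis, input_dim, num_filters))   # trainable coefficients

# Same broadcasting expression as in build(), written with NumPy.
kernel = np.sum(kernelA[None, None, :, :, :] * kernelB[:, :, :, None, None], axis=2)

# Equivalent contraction over the basis axis b.
kernel_einsum = np.einsum("hwb,bio->hwio", kernelB, kernelA)

print(kernel.shape)                        # (11, 11, 1, 64)
print(np.allclose(kernel, kernel_einsum))  # True
```

The result has the usual `(height, width, input_channels, filters)` layout that a channels-last Conv2D kernel expects.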

I use CustConv2D as the first Conv layer of my model.

img = Input(shape=(width, height, 1))
l1 = CustConv2D(filters=64, kernel_size=(11, 11), kernelB=basis_L1, activation='relu')(img)

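The model compiles fine, but training fails with the following error:

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Is there a way to find out which operation raises the error? Also, is there an implementation mistake in the way I wrote the custom layer?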

【Comments】:

    Tags: python tensorflow keras


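    【Solution 1】:

    You are destroying your own build by calling the original Conv2D build: it replaces your self.kernel, so self.kernelA is never used in the output and backpropagation never reaches it.

    The parent build also expects a bias and all the other regular Conv2D settings: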

    class CustConv2D(Conv2D):
    
        def __init__(self, filters, kernel_size, kernelB=None, activation=None, **kwargs): 
    
            #...
            #...
    
            #don't use bias if you're not defining it:
            super(CustConv2D, self).__init__(self.num_filters, self.kernel_size, 
                  activation=activation,
                  use_bias=False, **kwargs)
    
            #bonus: don't forget to add the activation to the call above
            #it will also replace all your `self.anything` defined before this call   
    
    
        def build(self, input_shape):
    
            #...
            #...
    
            #don't use bias:
            self.bias = None
    
            #consider the layer built
            self.built = True
    
            #do not destroy your build: comment out (or delete) the parent call
            #super(CustConv2D, self).build(input_shape)
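
The same idea can also be sketched for TensorFlow 2.x, with the caveat that this is a different design from the code above and not the answerer's method. Instead of subclassing Conv2D it subclasses the plain `Layer` and builds the kernel from `kernelA` inside `call`, so every forward pass depends on the trainable weight. The class name `BasisConv2D`, channels-last input, stride 1 and "VALID" padding are all assumptions made for illustration.

```python
import numpy as np
import tensorflow as tf


class BasisConv2D(tf.keras.layers.Layer):
    """Hypothetical layer: kernel = fixed basis (kernelB) x trainable coefficients (kernelA)."""

    def __init__(self, filters, kernelB, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.filters = filters
        # Fixed basis of shape (height, width, num_basis); it is not trained.
        self.kernelB = np.asarray(kernelB, dtype="float32")
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        input_dim = int(input_shape[-1])  # channels_last assumed
        num_basis = self.kernelB.shape[-1]
        # The only trainable weight; no bias is created.
        self.kernelA = self.add_weight(
            name="kernelA",
            shape=(num_basis, input_dim, self.filters),
            initializer=tf.keras.initializers.RandomUniform(minval=-1.0, maxval=1.0),
        )

    def call(self, inputs):
        # Contract over the basis axis to get a (h, w, in, out) kernel.
        kernel = tf.einsum("hwb,bio->hwio", self.kernelB, self.kernelA)
        outputs = tf.nn.conv2d(inputs, kernel, strides=1, padding="VALID")
        return self.activation(outputs)


# Quick check that a gradient reaches kernelA (random data, illustrative sizes).
basis = np.random.default_rng(0).standard_normal((3, 3, 5))
layer = BasisConv2D(filters=4, kernelB=basis, activation="relu")
x = tf.random.normal((2, 8, 8, 1))
with tf.GradientTape() as tape:
    y = layer(x)
    loss = tf.reduce_sum(y)
grads = tape.gradient(loss, layer.trainable_weights)
print(y.shape, [g is not None for g in grads])
```

Computing the kernel in `call` rather than in `build` is a deliberate choice here: a tensor derived once in `build` is not guaranteed to be part of the computation that the gradient is taken through.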
    

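    【Comments】:

      【Solution 2】:

      This is probably because your code defines some weights that are not used to compute the output. The gradient of the loss with respect to such a weight is therefore None (undefined).

      A coding example can be found here: https://github.com/keras-team/keras/issues/12521#issuecomment-496743146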
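
One way to locate such a weight is to request the gradient of the loss with respect to every variable and list the ones that come back as `None`. The sketch below uses TensorFlow 2.x's `tf.GradientTape`, which is an assumption on my part since the question uses the Keras 2.x backend; the variables `used` and `unused` are made up for illustration. For a real model, the list of variables would presumably be `model.trainable_weights`.

```python
import tensorflow as tf

used = tf.Variable([[1.0], [2.0]], name="used")
unused = tf.Variable([[3.0], [4.0]], name="unused")  # created but never used below

x = tf.constant([[0.5, -1.0]])
with tf.GradientTape() as tape:
    y = tf.matmul(x, used)        # only `used` takes part in the output
    loss = tf.reduce_sum(y ** 2)

variables = [used, unused]
grads = tape.gradient(loss, variables)

# Variables with no path to the loss get None instead of a gradient tensor.
missing = [v.name for v, g in zip(variables, grads) if g is None]
print(missing)
```

Any name that shows up in `missing` belongs to a weight that the forward pass never touches, which is the situation described above.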

      【Comments】:
