【Title】: Setting a keras layer as not trainable after a compile changes the number of total parameters in the summary
【Posted】: 2020-08-26 12:02:09
【Question】:

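I would like to know how to interpret the following model summary output from the keras library. The results below were obtained with keras version 2.3.1.

In keras, we can set a layer's trainable attribute so that its weights do not change during training.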

from keras.models import Sequential
from keras.layers import Dense
model = Sequential([
    Dense(5, input_dim=3), Dense(1)
])
model.summary()
print("***")

model.layers[0].trainable = False
model.summary()
Model: "sequential_36"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_101 (Dense)            (None, 5)                 20        
_________________________________________________________________
dense_102 (Dense)            (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 26
Non-trainable params: 0
_________________________________________________________________
***
Model: "sequential_36"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_101 (Dense)            (None, 5)                 20        
_________________________________________________________________
dense_102 (Dense)            (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 6
Non-trainable params: 20

The result above is intuitive: because I set the first layer as not trainable, there are fewer trainable parameters.
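For reference, the per-layer numbers can be reproduced by hand. The sketch below is plain Python and does not call keras; the helper name dense_param_count is made up for illustration. It assumes the default Dense layer with a bias term, so each layer has input_dim * units kernel weights plus units biases.

```python
def dense_param_count(input_dim: int, units: int, use_bias: bool = True) -> int:
    """Kernel has one weight per (input, unit) pair; the bias adds one per unit."""
    return input_dim * units + (units if use_bias else 0)

# (name, input_dim, units, trainable) for the two layers in the question,
# with the first layer frozen as in the second summary.
layers = [
    ("dense_101", 3, 5, False),
    ("dense_102", 5, 1, True),
]

counts = {name: dense_param_count(i, u) for name, i, u, _ in layers}
trainable = sum(counts[name] for name, _, _, t in layers if t)
non_trainable = sum(counts[name] for name, _, _, t in layers if not t)

print(counts)                                      # per-layer parameter counts
print("Total params:", trainable + non_trainable)
print("Trainable params:", trainable)
print("Non-trainable params:", non_trainable)
```

Under these assumptions the first layer contributes 3 * 5 + 5 = 20 parameters and the second 5 * 1 + 1 = 6, which matches the 26 / 6 / 20 split in the summary.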

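If I compile the model before changing the attribute (this is not the standard order, but it can happen in some applications), I get the following result.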

model = Sequential([
    Dense(5, input_dim=3), Dense(1)
])
model.compile(loss="mse", optimizer="adam")

model.summary()
print("***")
model.layers[0].trainable = False
model.summary()
Model: "sequential_38"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_105 (Dense)            (None, 5)                 20        
_________________________________________________________________
dense_106 (Dense)            (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 26
Non-trainable params: 0
_________________________________________________________________
***
Model: "sequential_38"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_105 (Dense)            (None, 5)                 20        
_________________________________________________________________
dense_106 (Dense)            (None, 1)                 6         
=================================================================
Total params: 46
Trainable params: 26
Non-trainable params: 20

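This reports more parameters than before. Can someone clarify how these numbers should be interpreted?

[Edit]

Judging from the answers received, this appears to be a bug (or misfeature) whose behavior depends on the package version. Here is another example, this time using the tensorflow keras API. Unlike in @lukasz-tracewski's answer, I still get the same increased parameter count, along with a differently formatted warning message. Perhaps the versions differ slightly?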

import tensorflow as tf
print("tensorflow version is", tf.__version__)
print("keras version is", tf.keras.__version__)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(5, input_dim=3), Dense(1)
])
model.compile(loss="mse", optimizer="adam")

model.summary()
print("***")
model.layers[0].trainable = False
model.summary()
tensorflow version is 2.1.0
keras version is 2.2.4-tf
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 5)                 20        
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 6         
=================================================================
Total params: 26
Trainable params: 26
Non-trainable params: 0
_________________________________________________________________
***
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 5)                 20        
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 6         
=================================================================
WARNING:tensorflow:Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?
Total params: 46
Trainable params: 26
Non-trainable params: 20

【Discussion】:

    Tags: python tensorflow keras


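    【Solution 1】:

    This is a bug (or misfeature) in keras, and there is an issue filed for it. As you can see from the comments on that issue, it was resolved only by emitting a warning about the discrepancy: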

    UserWarning: Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?
      'Discrepancy between trainable weights and collected trainable'
    Total params: 46
    Trainable params: 26
    Non-trainable params: 20
    

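    If you work in a Jupyter-like environment (which sometimes swallows warning messages), you may miss this warning.

    In short, this happens because the summary method takes the trainable parameters from the compiled model and then counts the non-trainable parameters separately. That is why you get 26 (all the parameters that were trainable when the model was compiled) + 20 (from the non-trainable attribute, which is checked later) = 46.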
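The double counting described above can be imitated with a toy model. This is only a sketch of the explanation in this answer, not keras's actual implementation: snapshot_trainable and summarize are hypothetical helpers, and the per-layer counts are copied from the summary output in the question.

```python
# Per-layer parameter counts taken from the summary in the question.
param_counts = {"dense": 20, "dense_1": 6}
trainable_flag = {"dense": True, "dense_1": True}

def snapshot_trainable(flags):
    """Hypothetical stand-in for compile(): record which layers are trainable right now."""
    return [name for name, is_trainable in flags.items() if is_trainable]

def summarize(collected, flags):
    """Toy summary: trainable count from the snapshot, non-trainable count from the live flags."""
    trainable = sum(param_counts[name] for name in collected)
    non_trainable = sum(param_counts[name] for name, t in flags.items() if not t)
    return {
        "total": trainable + non_trainable,
        "trainable": trainable,
        "non_trainable": non_trainable,
    }

collected = snapshot_trainable(trainable_flag)  # "compile" with both layers trainable
trainable_flag["dense"] = False                 # freeze the first layer afterwards
stale = summarize(collected, trainable_flag)    # snapshot is now out of date

collected = snapshot_trainable(trainable_flag)  # "recompile" after freezing
fresh = summarize(collected, trainable_flag)

print("stale snapshot:", stale)
print("after recompile:", fresh)
```

With the stale snapshot the frozen layer's 20 parameters are counted on both sides, giving 26 + 20 = 46. Refreshing the snapshot, which in this toy model plays the role of calling model.compile again as the warning suggests, brings the total back to 26.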

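    The Tensorflow keras API does not have this bug, at least in the version tested here (Tensorflow 2.2).

    [Edit]

    Since "Tensorflow with the Keras API" and "Keras with the Tensorflow backend" are easy to confuse, the code for the former is given below. It is almost identical to the code provided by the OP; only the imports differ.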

    from tensorflow.keras.layers import Dense
    from tensorflow.keras import Sequential
    
    model = Sequential([
        Dense(5, input_dim=3), Dense(1)
    ])
    model.compile(loss="mse", optimizer="adam")
    
    model.summary()
    print("***")
    model.layers[0].trainable = False
    model.summary()
    

    Output:

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    dense_2 (Dense)              (None, 5)                 20        
    _________________________________________________________________
    dense_3 (Dense)              (None, 1)                 6         
    =================================================================
    Total params: 26
    Trainable params: 26
    Non-trainable params: 0
    _________________________________________________________________
    ***
    Model: "sequential_1"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    dense_2 (Dense)              (None, 5)                 20        
    _________________________________________________________________
    dense_3 (Dense)              (None, 1)                 6         
    =================================================================
    Total params: 26
    Trainable params: 6
    Non-trainable params: 20
    

    Note that there is no warning.

    My keras version is 2.3.1 and Tensorflow is 2.2.

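    【Comments】:

    • Thanks for the clarification. With tensorflow 2.1.0 and keras API version 2.2.4-tf, I get the same parameter counts but a slightly different warning message. Which version of tensorflow are you referring to? The message I get is "WARNING:tensorflow:Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?"
    • @KotaMori The keras test was run on 2.3.1 with Tensorflow 2.2 as the backend. The Tensorflow test was run on 2.2. In that case we cannot really talk about a keras version, since keras is just one of the interfaces sitting on top of the module. And yes, model.trainable was set without calling model.compile afterwards. See the extra note in the answer. One more suggestion: if possible, switch from keras to tensorflow. The main author moved to Google and continues to develop (and fix!) tensorflow.
    • Thanks for the suggestion. I think the question has been answered, but I have added another example to the question using the tensorflow keras API, where I still get the increased parameter count and the warning. I don't think we need to pin down which versions emit the warning and which don't, though.
    • I guess the change happened between TF 2.1 and 2.2. This is where summary is defined: github.com/tensorflow/tensorflow/commits/master/tensorflow/… If you look at this recent commit github.com/tensorflow/tensorflow/commit/… it removes the consistency check, and the way the trainable parameters are counted is also different.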
    【Solution 2】:

    I am using TF 2.2.

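    # Imports assumed to match Solution 1 (tensorflow.keras, TF 2.2)
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Dense

    model = Sequential([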
        Dense(5, input_dim=3), Dense(1)
    ])
    model.compile(loss="mse", optimizer="adam")
    
    model.summary()
    print("***")
    model.layers[0].trainable = False
    model.summary()
    

    Summary:

    Model: "sequential_1"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    dense_1 (Dense)              (None, 5)                 20        
    _________________________________________________________________
    dense_2 (Dense)              (None, 1)                 6         
    =================================================================
    Total params: 26
    Trainable params: 26
    Non-trainable params: 0
    _________________________________________________________________
    ***
    Model: "sequential_1"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    dense_1 (Dense)              (None, 5)                 20        
    _________________________________________________________________
    dense_2 (Dense)              (None, 1)                 6         
    =================================================================
    Total params: 26
    Trainable params: 6
    Non-trainable params: 20
    

    【Comments】:

    • Thanks. So perhaps this is a feature (bug) introduced in later versions.