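Initial weights in a network with Keras / TensorFlow: answers

【Question title】: Initial weights in Network with Keras / Tensorflow
【Posted】: 2021-10-13 08:38:07
【Question description】:

I am trying to get the initial weights of a given network.

This thread suggests that the input dimension needs to be specified: How to view initialized weights (i.e. before training)?

This thread suggests that the weights should be available after compiling: Reset weights in Keras layer

"Save the initial weights right after compiling the model but before training."

Although I could reproduce the results from the first post, I cannot apply them to my case: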

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf

# Reproducing https://stackoverflow.com/questions/46798708/how-to-view-initialized-weights-i-e-before-training

# First model without input_dim prints an empty list
model = Sequential()
model.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)],
                activation='relu'))
print(model.get_weights())

# Second model with input_dim prints the assigned weights
model1 = Sequential()
model1.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)],
                 input_dim=3,
                 activation='relu'))
model1.add(Dense(1, activation='sigmoid'))

print(model1.get_weights())


class Test(tf.keras.Model):
    def __init__(self, n_inputs: int, neurons=10):
        super(Test, self).__init__(name="Test")
        self.neurons = neurons

        # Initializers
        mean, std = 0., 0.0005
        bias_normalization = None
        kernel_initializer = tf.keras.initializers.RandomNormal(mean=mean,
                                                                stddev=std)
        self.h1 = Dense(n_inputs, activation="linear", name="h1",
                        kernel_initializer=kernel_initializer,
                        bias_initializer=bias_normalization,
                        input_dim=n_inputs)

    def call(self, inputs):
        x = self.h1(inputs)
        return x


# Model Test

test = Test(n_inputs=1, neurons=100)
test.get_weights()  # empty, expected

test.compile()
test.get_weights()  # empty, unexpected

【Comments】:

What is wrong with the second suggestion? Saving the weights right away?

Tags: python tensorflow machine-learning keras deep-learning

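【Solution 1】:

In your case, I think it all depends on when the call method of tf.keras.Model is actually invoked. Also, Keras Sequential models and subclassed models behave differently.

The weights of your model are only created when you pass real data or explicitly call build(*). For example, if you try the following, you will get some weights as output: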

test_model = Test(n_inputs=1, neurons=100)
test_model(np.random.random((32, 1)))
print(test_model.get_weights())
# [array([[0.00057544]]), array([0.3752869])]

or

test_model.build(input_shape=(32, 1))
print(test_model.get_weights())
# [array([[8.942684e-05]], dtype=float32), array([-1.6799461], dtype=float32)]

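Basically, the __call__ method internally invokes the call method. The first __call__ is also what builds the layers from the input shape, which is why the weights only appear at that point. The official TensorFlow website describes this behavior:

To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.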
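
To make the timing concrete, here is a minimal sketch of when a subclassed model gets its weights. TinyModel, its layer size and the input shape are hypothetical choices made only for illustration, and the exact behavior of an unbuilt model may differ between Keras versions.

```python
import numpy as np
import tensorflow as tf


class TinyModel(tf.keras.Model):
    """Hypothetical one-layer model, used only for illustration."""

    def __init__(self, units=1):
        super().__init__(name="tiny")
        self.h1 = tf.keras.layers.Dense(units, name="h1")

    def call(self, inputs):
        return self.h1(inputs)


model = TinyModel(units=1)
model.compile(optimizer="sgd", loss="mse")

# compile() only configures training; it does not create any variables.
built_after_compile = model.built

# The first model(inputs) goes through __call__, which builds the layers
# from the input shape and then runs call().
_ = model(np.zeros((4, 3), dtype="float32"))
built_after_call = model.built
kernel, bias = model.get_weights()
print(built_after_compile, built_after_call, kernel.shape, bias.shape)
```

Checking model.built before reading the weights is one way to tell "not created yet" apart from "created but empty".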

You could also define your Test class as follows:

class Test(tf.keras.Model):
    def __init__(self, n_inputs: int, neurons=10):
        super(Test, self).__init__(name="Test")
        self.neurons = neurons

        # Initializers
        mean, std = 0., 0.0005
        bias_normalization = None
        kernel_initializer = tf.keras.initializers.RandomNormal(mean=mean,
                                                                stddev=std)
        # Applying the layer to a symbolic Input creates its weights immediately
        model_input = tf.keras.layers.Input(shape=(n_inputs,))
        x = tf.keras.layers.Dense(n_inputs, activation="linear", name="h1",
                                  kernel_initializer=kernel_initializer,
                                  bias_initializer=bias_normalization)(model_input)
        self.model = tf.keras.Model(model_input, x)

    def call(self, inputs):
        # Delegate the forward pass to the wrapped functional model
        return self.model(inputs)

test_model = Test(n_inputs=1, neurons=100)
print(test_model.get_weights())
# [array([[0.00045629]], dtype=float32), array([0.9945322], dtype=float32)]
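
Once the weights exist, the original goal of keeping the initial values around can be sketched roughly as follows. This is only an illustration under assumed settings: the layer sizes, the random training data and the single epoch are all made up, and a Sequential model with an explicit Input is used so the weights are available without calling the model first.

```python
import numpy as np
import tensorflow as tf

# Giving the model an explicit Input lets Keras create the weights up front.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(5, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")

# Snapshot the initial weights before any training step runs.
initial_weights = [w.copy() for w in model.get_weights()]

# Made-up training data, only there so that fit() changes the weights.
rng = np.random.default_rng(0)
x = rng.random((32, 3)).astype("float32")
y = rng.random((32, 1)).astype("float32")
model.fit(x, y, epochs=1, verbose=0)

changed = any(not np.array_equal(a, b)
              for a, b in zip(initial_weights, model.get_weights()))

# Restore the snapshot to get back to the untrained state.
model.set_weights(initial_weights)
restored = all(np.array_equal(a, b)
               for a, b in zip(initial_weights, model.get_weights()))
print(changed, restored)
```

The same get_weights() / set_weights() pair should work for the subclassed Test model as well, provided it has been built or called once beforehand.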

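【Comments】:

Thanks. When subclassing, is there a way to build the model without explicitly using the model? Why doesn't the following work? Adding input = Input(shape=(n_inputs, )) self.h1 = Dense(n_inputs, activation="linear", name="h1", kernel_initializer=kernel_initializer, bias_initializer=bias_normalization, input_shape=(n_inputs,)) and then calling compile(). Doesn't that build the model? Thanks!
To be honest, I have never tried that option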

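【Solution 2】:

Not sure why you chose the burden of defining a dedicated class here, but your Test class does not define a model, only a Dense layer; so it is not surprising that you end up with no weights. Changing self.h1 to: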

        self.h1 = Sequential([Dense(n_inputs, activation="linear", name="h1",
                                    kernel_initializer=kernel_initializer,
                                    bias_initializer=bias_normalization,
                                    input_dim=n_inputs)])

resolves the issue in both cases:

test = Test(n_inputs=1, neurons=100)
test.get_weights()  
# [array([[-0.00030265]], dtype=float32), array([-1.6327941], dtype=float32)]

test.compile()
test.get_weights()  # same weights as above
# [array([[-0.00030265]], dtype=float32), array([-1.6327941], dtype=float32)]

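Notice that, as the framework evolves, things seem to have changed between versions, so some details from older threads may differ today. For example, in the Keras version currently used by Google Colab (2.6.0), model.get_weights() throws an error if input_dim is not defined; it does not return an empty list: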

from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import keras

keras.__version__
# '2.6.0'

model = Sequential()
model.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)],
                activation='relu'))
print(model.get_weights())

This gives:

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-4-1f22e9c3ce78> in <module>()
      2 model.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)],
      3                 activation='relu'))
----> 4 print(model.get_weights())

5 frames

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in get_weights(self)
   2090     """
   2091     with self.distribute_strategy.scope():
-> 2092       return super(Model, self).get_weights()
   2093 
   2094   def save(self,

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in get_weights(self)
   1844         Weights values as a list of NumPy arrays.
   1845     """
-> 1846     weights = self.weights
   1847     output_weights = []
   1848     for weight in weights:

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in weights(self)
   2488       A list of variables.
   2489     """
-> 2490     return self._dedup_weights(self._undeduplicated_weights)
   2491 
   2492   @property

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in _undeduplicated_weights(self)
   2493   def _undeduplicated_weights(self):
   2494     """Returns the undeduplicated list of all layer variables/weights."""
-> 2495     self._assert_weights_created()
   2496     weights = []
   2497     for layer in self._self_tracked_trackables:

/usr/local/lib/python3.7/dist-packages/keras/engine/sequential.py in _assert_weights_created(self)
    465     # When the graph has not been initialized, use the Model's implementation to
    466     # to check if the weights has been created.
--> 467     super(functional.Functional, self)._assert_weights_created()  # pylint: disable=bad-super-call
    468 
    469 

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in _assert_weights_created(self)
   2672                        'Weights are created when the Model is first called on '
   2673                        'inputs or `build()` is called with an `input_shape`.' %
-> 2674                        self.name)
   2675 
   2676   def _check_call_args(self, method_name):

ValueError: Weights for model sequential_1 have not yet been created. Weights are created when the Model is first called on inputs or `build()` is called with an `input_shape`.
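
Since the behavior of an unbuilt model depends on the version, one defensive option is to check the built flag before asking for weights. The sketch below assumes a helper named weights_or_none, which is hypothetical and not part of Keras, and an input width of 3 chosen only for illustration.

```python
import tensorflow as tf


def weights_or_none(model):
    """Hypothetical helper: return the weights, or None if not created yet."""
    if not model.built:
        return None
    return model.get_weights()


model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(5, activation="relu"))  # no input shape given

before = weights_or_none(model)

# build() with an explicit input shape creates the variables without any data.
model.build(input_shape=(None, 3))
after = weights_or_none(model)
print(before is None, [w.shape for w in after])
```

This avoids depending on whether a given version raises the ValueError above or quietly returns an empty list.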

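Finally, as a general remark, you should be careful not to mix functionality from keras and tf.keras; these are different libraries, so pick one and use it consistently throughout your code.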
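
As an illustration of sticking to one namespace, the second model from the question could be written with tf.keras only, roughly as below. Note one deliberate swap: the constant starting values are set through the "ones" and "zeros" initializers rather than the weights= constructor argument, an assumption made so the sketch does not rely on that argument being accepted by the installed version.

```python
import numpy as np
import tensorflow as tf

# Everything comes from tf.keras; nothing is imported from the keras package.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(5, activation="relu",
                          kernel_initializer="ones",
                          bias_initializer="zeros"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# The first two arrays are the kernel and bias of the first Dense layer.
kernel, bias = model.get_weights()[:2]
print(kernel.shape, bias.shape)
```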

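【Comments】:

Thanks. Still very green at this. The reason for subclassing is that I inherited the code from another source. The class itself builds a multi-output model, but it should be possible to create the model through the functional API without subclassing. For the same reason, I had not noticed the use of different libraries. Thanks for pointing it out.
Could you clarify what you mean when you say it does not define a model? Sequential also inherits from Model. This evaluates to True: assert isinstance(test, tf.keras.Model)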