【Question title】: Initial weights in Network with Keras / Tensorflow
【Posted】: 2021-10-13 08:38:07
【Question】:

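I am trying to get the initial weights of a given network.

This post suggests that the input dimension needs to be specified: How to view initialized weights (i.e. before training)?

This post suggests that the weights should be available after compiling: Reset weights in Keras layer

Save the initial weights right after compiling the model but before training it.

Although I reproduced the results from the first post, I could not apply them to my case: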

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf

# Reproducing https://stackoverflow.com/questions/46798708/how-to-view-initialized-weights-i-e-before-training

# First model without input_dim prints an empty list
model = Sequential()
model.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)],
                activation='relu'))
print(model.get_weights())

# Second model with input_dim prints the assigned weights
model1 = Sequential()
model1.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)],
                 input_dim=3,
                 activation='relu'))
model1.add(Dense(1, activation='sigmoid'))

print(model1.get_weights())


class Test(tf.keras.Model):
    def __init__(self, n_inputs: int, neurons=10):
        super(Test, self).__init__(name="Test")
        self.neurons = neurons

        # Initializers
        mean, std = 0., 0.0005
        bias_normalization = None
        kernel_initializer = tf.keras.initializers.RandomNormal(mean=mean,
                                                                stddev=std)
        self.h1 = Dense(n_inputs, activation="linear", name="h1",
                        kernel_initializer=kernel_initializer,
                        bias_initializer=bias_normalization,
                        input_dim=n_inputs)

    def call(self, inputs):
        x = self.h1(inputs)
        return x


# Model Test

test = Test(n_inputs=1, neurons=100)
test.get_weights()  # empty, expected

test.compile()
test.get_weights()  # empty, unexpected

【Comments】:

  • What is wrong with the second suggestion? Saving the weights right away?

Tags: python tensorflow machine-learning keras deep-learning


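【Solution 1】:

In your case, I think it all depends on when the call method of tf.keras.Model is actually invoked. Also, Keras Sequential models and subclassed models behave differently.

Your model's weights are only created when you pass real data or explicitly call build(*). For example, if you try the following, you will get some weights in the output: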

test_model = Test(n_inputs=1, neurons=100)
test_model(np.random.random((32, 1)))
print(test_model.get_weights())
# [array([[0.00057544]]), array([0.3752869])]

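# Alternatively, build a fresh instance explicitly instead of passing data
test_model = Test(n_inputs=1, neurons=100)
test_model.build(input_shape=(32, 1))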
print(test_model.get_weights())
# [array([[8.942684e-05]], dtype=float32), array([-1.6799461], dtype=float32)]

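Essentially, the __call__ method (that is, model(inputs)) invokes the call method internally and builds the layers on first use. The official Tensorflow website describes this behavior:

To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.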
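
As a rough illustration of why the weights appear only after the first call, here is a toy sketch in plain Python. The LazyLayer class is hypothetical and is not Keras code; it only mimics the pattern in which __call__ runs build once (when the input size is finally known) and then delegates to call.

```python
class LazyLayer:
    """Toy stand-in for a Keras layer; NOT the real Keras implementation."""

    def __init__(self, units):
        self.units = units
        self.built = False
        self.weights = []  # empty until the input size is known

    def build(self, input_dim):
        # The kernel shape depends on the input size, so it can only be
        # created once an input (or an explicit shape) has been seen.
        kernel = [[0.0] * self.units for _ in range(input_dim)]
        bias = [0.0] * self.units
        self.weights = [kernel, bias]
        self.built = True

    def call(self, inputs):
        # Plain dense forward pass: inputs @ kernel + bias
        kernel, bias = self.weights
        return [
            [sum(x * kernel[i][j] for i, x in enumerate(row)) + bias[j]
             for j in range(self.units)]
            for row in inputs
        ]

    def __call__(self, inputs):
        # Build lazily on first use, then delegate to call()
        if not self.built:
            self.build(input_dim=len(inputs[0]))
        return self.call(inputs)


layer = LazyLayer(units=2)
print(layer.weights)       # [] : nothing exists before the first call
out = layer([[1.0, 2.0, 3.0]])
print(len(layer.weights))  # 2 : kernel and bias were created by __call__
```

Calling compile() does not go through this path, which is consistent with the weights still being empty after compile() in the question.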

You can also define your Test class as follows:

class Test(tf.keras.Model):
    def __init__(self, n_inputs: int, neurons=10):
        super(Test, self).__init__(name="Test")
        self.neurons = neurons

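        # Initializers
        mean, std = 0., 0.0005
        bias_normalization = None
        kernel_initializer = tf.keras.initializers.RandomNormal(mean=mean,
                                                                stddev=std)
        # A functional sub-model with an explicit Input creates its weights
        # immediately, inside __init__
        model_input = tf.keras.layers.Input(shape=(n_inputs,))
        x = tf.keras.layers.Dense(n_inputs, activation="linear", name="h1",
                                  kernel_initializer=kernel_initializer,
                                  bias_initializer=bias_normalization)(model_input)
        self.model = tf.keras.Model(model_input, x)

    def call(self, inputs):
        # Delegate the forward pass to the functional sub-model
        return self.model(inputs)
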
test_model = Test(n_inputs=1, neurons=100)
print(test_model.get_weights())
# [array([[0.00045629]], dtype=float32), array([0.9945322], dtype=float32)]
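
Coming back to the original goal of keeping the initial weights, the following is a minimal sketch, assuming a simplified version of the question's Test class (bias left at the Dense default) and made-up training data. It calls the model once so the variables exist, takes a snapshot with get_weights(), trains briefly, and restores the snapshot with set_weights().

```python
import numpy as np
import tensorflow as tf


class Test(tf.keras.Model):
    """Single-Dense subclassed model, mirroring the question's Test class."""

    def __init__(self, n_inputs: int):
        super().__init__(name="Test")
        kernel_initializer = tf.keras.initializers.RandomNormal(mean=0.0,
                                                                stddev=0.0005)
        self.h1 = tf.keras.layers.Dense(n_inputs, activation="linear",
                                        name="h1",
                                        kernel_initializer=kernel_initializer)

    def call(self, inputs):
        return self.h1(inputs)


model = Test(n_inputs=1)
model(tf.zeros((1, 1)))  # the first call creates the variables

# Snapshot the initial values as NumPy copies: [kernel, bias]
initial_weights = [w.copy() for w in model.get_weights()]

# Train briefly on made-up data (an assumption for illustration only)
x = np.random.random((32, 1)).astype("float32")
y = 3.0 * x + 1.0
model.compile(optimizer="sgd", loss="mse")
model.fit(x, y, epochs=1, verbose=0)

# Restore the snapshot taken before training
model.set_weights(initial_weights)
restored = model.get_weights()
```

The snapshot is taken after the first call rather than after compile(), since compile() alone does not create the variables of a subclassed model.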

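【Comments】:

  • Thanks. When subclassing, is there a way to build the model without explicitly using Model? Why doesn't the following work? Adding input = Input(shape=(n_inputs, )) self.h1 = Dense(n_inputs, activation="linear", name="h1", kernel_initializer=kernel_initializer, bias_initializer=bias_normalization, input_shape=(n_inputs,)) and then calling compile(). Doesn't that build the model? Thanks!
  • To be honest, I have never tried that option.
【Solution 2】:

Not sure why you chose to take on the burden of defining a dedicated class here, but your Test class does not define a model, only a Dense layer; so it is not surprising that you end up with no weights. Changing self.h1 to: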

        self.h1 = Sequential([Dense(n_inputs, activation="linear", name="h1",
                                    kernel_initializer=kernel_initializer,
                                    bias_initializer=bias_normalization,
                                    input_dim=n_inputs)])

resolves the issue in both cases:

test = Test(n_inputs=1, neurons=100)
test.get_weights()  
# [array([[-0.00030265]], dtype=float32), array([-1.6327941], dtype=float32)]

test.compile()
test.get_weights()  # same weights as above
# [array([[-0.00030265]], dtype=float32), array([-1.6327941], dtype=float32)]

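Note that things seem to have changed between versions as the framework evolved, so some details of older threads may differ today. For example, in the Keras version currently used by Google Colab (2.6.0), if input_dim is not defined, model.get_weights() throws an error instead of returning an empty list: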

from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import keras

keras.__version__
# '2.6.0'

model = Sequential()
model.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)],
                activation='relu'))
print(model.get_weights())

This gives:

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-4-1f22e9c3ce78> in <module>()
      2 model.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)],
      3                 activation='relu'))
----> 4 print(model.get_weights())

5 frames

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in get_weights(self)
   2090     """
   2091     with self.distribute_strategy.scope():
-> 2092       return super(Model, self).get_weights()
   2093 
   2094   def save(self,

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in get_weights(self)
   1844         Weights values as a list of NumPy arrays.
   1845     """
-> 1846     weights = self.weights
   1847     output_weights = []
   1848     for weight in weights:

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in weights(self)
   2488       A list of variables.
   2489     """
-> 2490     return self._dedup_weights(self._undeduplicated_weights)
   2491 
   2492   @property

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in _undeduplicated_weights(self)
   2493   def _undeduplicated_weights(self):
   2494     """Returns the undeduplicated list of all layer variables/weights."""
-> 2495     self._assert_weights_created()
   2496     weights = []
   2497     for layer in self._self_tracked_trackables:

/usr/local/lib/python3.7/dist-packages/keras/engine/sequential.py in _assert_weights_created(self)
    465     # When the graph has not been initialized, use the Model's implementation to
    466     # to check if the weights has been created.
--> 467     super(functional.Functional, self)._assert_weights_created()  # pylint: disable=bad-super-call
    468 
    469 

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in _assert_weights_created(self)
   2672                        'Weights are created when the Model is first called on '
   2673                        'inputs or `build()` is called with an `input_shape`.' %
-> 2674                        self.name)
   2675 
   2676   def _check_call_args(self, method_name):

ValueError: Weights for model sequential_1 have not yet been created. Weights are created when the Model is first called on inputs or `build()` is called with an `input_shape`.
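
The error message itself names the remedy. Below is a short sketch, assuming an input width of 3 as in the question and using only tf.keras imports. It builds the deferred Sequential model explicitly before reading the weights. The try/except is there because the behavior of get_weights() on an unbuilt model varies by version: it may raise, as above, or return an empty list.

```python
import tensorflow as tf

# No input shape is given, so the weights are not created yet
model = tf.keras.Sequential([tf.keras.layers.Dense(5, activation="relu")])

try:
    unbuilt = model.get_weights()  # some versions return [] here instead
    print("Unbuilt model returned:", unbuilt)
except ValueError as err:
    print("Not built yet:", err)

# Supplying the input shape creates the kernel and bias
model.build(input_shape=(None, 3))
shapes = [w.shape for w in model.get_weights()]
print(shapes)  # [(3, 5), (5,)]
```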

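Finally, as a general remark, you should take care not to mix functionality from keras and tf.keras; these are different libraries, so pick one and use it consistently throughout your code.

【Comments】:

  • Thanks. I am still very green at this. The reason for subclassing is that I inherited the code from another source. The class itself builds a multi-output model, but it should be possible to create the model through the functional API without subclassing. For the same reason, I had not noticed the use of different libraries. Thanks for pointing it out.
  • Could you clarify what you mean when you say it does not define a model? Sequential also inherits from Model. This assertion passes: assert isinstance(test, tf.keras.Model)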