[Question title]: Error: "Unable to create link (name already exists)" when saving a whole model consisting of two identical pretrained models
[Posted]: 2021-05-06 00:32:03
[Question description]:

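I have a simple Keras model that consists of two identical pretrained models (EfficientNetB2).

When I try to save the whole model (weights together with the optimizer state), I get the following error: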

Found 964 validated image filenames belonging to 2 classes.
Found 964 validated image filenames belonging to 2 classes.
Epoch 1/2
120/120 [==============================] - ETA: 0s - loss: 13.4150 - accuracy: 0.5128Found 297 validated image filenames belonging to 2 classes.
Found 297 validated image filenames belonging to 2 classes.
120/120 [==============================] - 90s 466ms/step - loss: 13.4148 - accuracy: 0.5127 - val_loss: 13.7815 - val_accuracy: 0.4626

Epoch 00001: saving model to /content/drive/My Drive/web_crawling/weightst-01-0.4626-13.7815.hdf5
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-132-adc5fc37359d> in <module>()
      1 # model.fit([x_train1,x_train2],y_train,batch_size=4,epochs=10,validation_split=0.1,shuffle=True,callbacks=callbacks_list)
----> 2 model.fit(train_generator,epochs=2,steps_per_epoch = tr_sample // batch_size, validation_data = validation_generator,validation_steps = val_sample // batch_size,callbacks=callbacks_list)#,class_weight=class_weight)

9 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1143           epoch_logs.update(val_logs)
   1144 
-> 1145         callbacks.on_epoch_end(epoch, epoch_logs)
   1146         training_logs = epoch_logs
   1147         if self.stop_training:

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/callbacks.py in on_epoch_end(self, epoch, logs)
    426     for callback in self.callbacks:
    427       if getattr(callback, '_supports_tf_logs', False):
--> 428         callback.on_epoch_end(epoch, logs)
    429       else:
    430         if numpy_logs is None:  # Only convert once.

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/callbacks.py in on_epoch_end(self, epoch, logs)
   1342     # pylint: disable=protected-access
   1343     if self.save_freq == 'epoch':
-> 1344       self._save_model(epoch=epoch, logs=logs)
   1345 
   1346   def _should_save_on_batch(self, batch):

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/callbacks.py in _save_model(self, epoch, logs)
   1406                 filepath, overwrite=True, options=self._options)
   1407           else:
-> 1408             self.model.save(filepath, overwrite=True, options=self._options)
   1409 
   1410         self._maybe_remove_file()

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py in save(self, filepath, overwrite, include_optimizer, save_format, signatures, options, save_traces)
   2000     # pylint: enable=line-too-long
   2001     save.save_model(self, filepath, overwrite, include_optimizer, save_format,
-> 2002                     signatures, options, save_traces)
   2003 
   2004   def save_weights(self,

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/saving/save.py in save_model(model, filepath, overwrite, include_optimizer, save_format, signatures, options, save_traces)
    152           'or using `save_weights`.')
    153     hdf5_format.save_model_to_hdf5(
--> 154         model, filepath, overwrite, include_optimizer)
    155   else:
    156     saved_model_save.save(model, filepath, overwrite, include_optimizer,

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/saving/hdf5_format.py in save_model_to_hdf5(model, filepath, overwrite, include_optimizer)
    129     if (include_optimizer and model.optimizer and
    130         not isinstance(model.optimizer, optimizer_v1.TFOptimizer)):
--> 131       save_optimizer_weights_to_hdf5_group(f, model.optimizer)
    132 
    133     f.flush()

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/saving/hdf5_format.py in save_optimizer_weights_to_hdf5_group(hdf5_group, optimizer)
    594     for name, val in zip(weight_names, weight_values):
    595       param_dset = weights_group.create_dataset(
--> 596           name, val.shape, dtype=val.dtype)
    597       if not val.shape:
    598         # scalar

/usr/local/lib/python3.7/dist-packages/h5py/_hl/group.py in create_dataset(self, name, shape, dtype, data, **kwds)
    137             dset = dataset.Dataset(dsid)
    138             if name is not None:
--> 139                 self[name] = dset
    140             return dset
    141 

/usr/local/lib/python3.7/dist-packages/h5py/_hl/group.py in __setitem__(self, name, obj)
    371 
    372             if isinstance(obj, HLObject):
--> 373                 h5o.link(obj.id, self.id, name, lcpl=lcpl, lapl=self._lapl)
    374 
    375             elif isinstance(obj, SoftLink):

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5o.pyx in h5py.h5o.link()

RuntimeError: Unable to create link (name already exists)

Here is my code:

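# Imports the snippet relies on (TF 2.x Keras API)
from tensorflow.keras.applications import EfficientNetB2
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.layers import Dense, GlobalMaxPooling2D, Input, concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.regularizers import l1

new_input1 = Input(shape=(224, 224, 3))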
new_input2 = Input(shape=(224, 224, 3))

effB3_1 = EfficientNetB2(include_top=False,weights='imagenet')
effB3_1._name="bf1"
effB3_11=effB3_1(new_input1)
gap_1=GlobalMaxPooling2D(name="gap_1")(effB3_11)

effB3_2 = EfficientNetB2(include_top=False,weights='imagenet')
effB3_2._name="bf2"
effB3_22=effB3_2(new_input2)
gap_2=GlobalMaxPooling2D(name="gap_2")(effB3_22)

merge=concatenate([gap_1,gap_2])
dense1=Dense(64, activation='relu', name="fc2", kernel_regularizer=l1(0.001),bias_regularizer=l1(0.001))(merge)
output=Dense(2, activation='softmax', name="fc_out")(dense1)
model=Model([new_input1,new_input2],output)

model.compile(loss='categorical_crossentropy',
              optimizer=sc,
              metrics=["accuracy"])
filepath="path" 
checkpoint=ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, mode='auto',save_weights_only=False)
callbacks_list = [checkpoint]

model.fit(train_generator,epochs=20,steps_per_epoch = tr_sample // batch_size, validation_data = validation_generator,validation_steps = val_sample // batch_size,callbacks=callbacks_list)

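If I change save_weights_only to True, everything works fine. I know the problem is related to saving the optimizer parameters, but I don't know how to fix the error and still save the whole model.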
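The last Keras frame in the traceback, save_optimizer_weights_to_hdf5_group, loops over the optimizer weight names and calls create_dataset for each one on the same group. The sketch below imitates that loop in plain Python to show why two weights with the same name would fail. FlatGroup is a toy stand-in, not the h5py API, and the weight names are hypothetical examples, not values read from the model above.

```python
# Toy stand-in for an HDF5 group: like the group in the traceback, it
# refuses a second entry whose name is already present. Not real h5py.
class FlatGroup:
    def __init__(self):
        self.entries = {}

    def create_dataset(self, name, value):
        if name in self.entries:
            raise RuntimeError("Unable to create link (name already exists)")
        self.entries[name] = value


def save_flat(weight_names):
    """Write every name into one group, as the loop in the traceback does."""
    group = FlatGroup()
    for name in weight_names:
        group.create_dataset(name, value=None)
    return group


# Hypothetical optimizer weight names: two identical backbones could yield
# the same name twice if their inner layers are named identically.
names = ["Adam/stem_conv/kernel/m:0", "Adam/stem_conv/kernel/m:0"]

try:
    save_flat(names)
    collided = False
except RuntimeError as err:
    collided = True
    print(err)
```

If this is indeed the cause, only the top-level names "bf1" and "bf2" differ between the two backbones, while the names of the layers inside them stay identical.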

P.S.: Before compiling, I also ran the following code on the full model and on one of the sub-models (effB3_1), but it did not solve the problem.

for i in range(len(model.weights)):
    model.weights[i]._handle_name = model.weights[i].name + "_" + str(i)

Google Colab, TF 2.4

[Discussion]:

    Tags: python tensorflow keras h5py


    [Solution 1]:

    Rename not only the weights but all variables, including the biases.

    for v in model.variables:
        v._handle_name = v.name + '_'
    

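    [Comments]:

    • I did that, but the error is still there. I don't think this error is related to the weights and biases, because everything works when "save_weights_only" is set to True. My guess is that the error is caused by the optimizer parameter names.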
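
    One way to test the guess in the comment above is to list the optimizer weight names and look for repeats. The helper below is a minimal sketch. find_duplicate_names is a hypothetical name, and the example list is made up for illustration. With a compiled and trained model, the names would presumably come from something like [w.name for w in model.optimizer.weights], which is an assumption to verify against your TF version.

```python
from collections import Counter


def find_duplicate_names(names):
    """Return the names that occur more than once, with their counts."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


# Made-up names for illustration only. With a real model one might pass
# [w.name for w in model.optimizer.weights] instead (assumed attribute).
example = [
    "Adam/iter:0",
    "Adam/stem_conv/kernel/m:0",
    "Adam/stem_conv/kernel/m:0",
    "Adam/fc2/kernel/m:0",
]
duplicates = find_duplicate_names(example)
print(duplicates)
```

    An empty result would suggest the collision comes from somewhere else. A non-empty one points at the exact names that need to become unique before the HDF5 save can succeed.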