【Question title】: Custom generator runs out of data even when steps_per_epoch is specified
【Posted】: 2020-12-13 13:59:31
【Question description】:

I am training a model with a custom generator, but it runs out of data before the first epoch finishes. I get the following error:

Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least (steps_per_epoch * epochs) batches (in this case, 8740 batches). You may need to use the repeat() function when building your dataset

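I have four generators: one for the training images and one for the training labels, and likewise for validation. I then zip each image generator with its label generator. Here is a prototype of my generator. I got the idea from here: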

import numpy as np
import nibabel as nib
from tensorflow import keras
import os

def weirddivision(n,d):
    return np.array(n)/np.array(d) if d else 0

class ImgDataGenerator(keras.utils.Sequence):

    def __init__(self, file_list, batch_size=8, shuffle=True):
        """Constructor can be expanded
           with batch size, dimensions, etc.
        """
        self.file_list = file_list
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.on_epoch_end()
        

    def __len__(self):
        'Take all batches in each iteration'
        return int(np.floor(len(self.file_list) / self.batch_size))

    def __getitem__(self, index):
        'Get next batch'
        # Generate indexes of the batch
        indexes = self.indexes[index*self.batch_size:(index+1)*self.batch_size]

        # single file
        file_list_temp = [self.file_list[k] for k in indexes]

        # Set of X_train and y_train
        X = self.__data_generation(file_list_temp)

        return X

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.indexes = np.arange(len(self.file_list))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def __data_generation(self, file_list_temp):
        'Generates data containing batch_size samples'
        train_loc = '/home/faruk/Desktop/BrainSeg/Dataset/Train/'
        X = np.empty((self.batch_size,224,256,1))
        # Generate data
        for i, ID in enumerate(file_list_temp):
            x_file_path = os.path.join(train_loc, ID)
            img = np.load(x_file_path)
            img = np.pad(img, pad_width=((14,13),(12,11)), mode='constant')
            img = np.expand_dims(img,-1)
            img = weirddivision(img, img.max())

            # Store sample
            X[i,] = img


        return X

As mentioned, I create the four generators and zip them here:

training_img_generator = ImgDataGenerator(train)
training_label_generator = LabelDataGenerator(train)
train_generator = zip(training_img_generator,training_label_generator)

val_img_generator = ValDataGenerator(val)
val_label_generator = ValLabelDataGenerator(val)
val_generator = zip(val_img_generator,val_label_generator)
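One thing worth checking at this step is that zip() returns a one-shot iterator: once it has been consumed, it yields nothing further, and it has no repeat() method. A minimal sketch with plain lists standing in for the batch generators (the names below are illustrative, not from the original code):

```python
# Plain lists stand in for the image and label batch generators.
image_batches = ["img_batch_0", "img_batch_1", "img_batch_2"]
label_batches = ["lab_batch_0", "lab_batch_1", "lab_batch_2"]

pairs = zip(image_batches, label_batches)

first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # nothing is left to yield

print(len(first_pass), len(second_pass))  # 3 0
print(hasattr(pairs, "repeat"))           # False
```

If the zipped keras.utils.Sequence objects behave the same way, the training loop would receive at most one pass over the data in total, regardless of steps_per_epoch.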

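Because the generators produce data on the fly, I thought Keras might be trying to draw more batches than are actually available. So I computed the number of steps per epoch and passed it to fit_generator: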

batch_size = 8
spe = len(train)//batch_size # len(train) = 34965
val_spe = len(val)//batch_size # len(val) = 4347

History=model.fit_generator(generator=train_generator, validation_data=val_generator, epochs=2, steps_per_epoch=spe, validation_steps = val_spe, shuffle=True, verbose=1)
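For reference, the batch count in the error message can be reproduced from the sizes quoted above. A small sketch of the arithmetic, using only numbers from the question (the variable names n_train, n_val and total_batches are introduced here for illustration):

```python
batch_size = 8
n_train, n_val = 34965, 4347  # dataset sizes quoted in the question
epochs = 2

spe = n_train // batch_size    # full training batches per epoch
val_spe = n_val // batch_size  # full validation batches per epoch
total_batches = spe * epochs   # batches the training generator must supply overall

print(spe, val_spe, total_batches)  # 4370 543 8740
```

The total matches the 8740 batches named in the error, so the training generator has to be able to yield that many batches across all epochs, not just one epoch's worth.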

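However, this still does not work. When I reduced the number of steps per epoch, I was able to finish the first epoch, but the error then appeared at the start of the second epoch. Apparently the generator needs to repeat indefinitely, but I do not know how to achieve that. Can I use an infinite while loop? If so, where?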

【Discussion】:

    Tags: python tensorflow keras deep-learning conv-neural-network


    【Solution 1】:

    Try this:

    train_generator = train_generator.repeat()
    val_generator = val_generator.repeat()
    

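    【Comments】:

    • Hi. Thanks for your answer. I tried it, but it gave me the following error: 'zip' object has no attribute 'repeat'. I think this is because .repeat() can only be called on tf.data objects.
    【Solution 2】:

    I solved the problem. I had defined my generator class as follows: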

    class ImgDataGenerator(keras.utils.Sequence):
    

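    However, my model is not Sequential... it is functional. I solved the problem by writing my own custom generator that does not inherit from keras.utils.Sequence.

    I hope this helps someone.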
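A custom generator of the kind described might look like the following. This is only a sketch, not the poster's actual code: the name paired_batches_forever is hypothetical, and it assumes the image and label objects support len() and integer indexing and return batches in the same order (so any shuffling has to be shared between them or turned off). Note also that keras.utils.Sequence is a data-loading base class and is unrelated to the Sequential model API; the likelier culprit is that zip() produces an iterator that can be consumed only once, which an endless loop avoids.

```python
def paired_batches_forever(images, labels):
    """Yield (image_batch, label_batch) tuples without ever stopping.

    Assumption: `images` and `labels` support len() and integer indexing
    and serve matching batches at the same index.
    """
    if len(images) != len(labels):
        raise ValueError("image and label sources differ in length")
    while True:  # start again from batch 0 after the last batch
        for i in range(len(images)):
            yield images[i], labels[i]


# Plain lists stand in for the real batch sources in this demonstration.
demo = paired_batches_forever(["x0", "x1", "x2"], ["y0", "y1", "y2"])
served = [next(demo) for _ in range(7)]
print(served[0], served[3], served[6])  # ('x0', 'y0') ('x0', 'y0') ('x0', 'y0')
```

In the question's setup it would be used as train_generator = paired_batches_forever(training_img_generator, training_label_generator), with steps_per_epoch still telling Keras where each epoch ends.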

    【Comments】:
