Keras: "Failed to find data adapter that can handle input: <class 'function'>, <class 'NoneType'>" in Batch Training

【Question title】: Keras, "data adapter that can handle input: <class 'function'>, <class 'NoneType'>" in Batch Training
【Posted】: 2021-01-11 12:36:23
【Question】:

I am trying to train my model in batches because my dataset is very large. But when I call

autoencoder_train = autoencoder.fit(my_training_batch_generator, 
                                    steps_per_epoch=steps_per_epoch, 
                                    epochs=nb_epoch,
                                    verbose=1, 
                                    validation_data=my_testing_batch_generator,
                                    validation_steps=validation_steps)

I get the following error:

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py in select_data_adapter(x, y)
    962         "Failed to find data adapter that can handle "
    963         "input: {}, {}".format(
--> 964             _type_name(x), _type_name(y)))
    965   elif len(adapter_cls) > 1:
    966     raise RuntimeError(

ValueError: Failed to find data adapter that can handle input: <class 'function'>, <class 'NoneType'>

The functions my_training_batch_generator and my_testing_batch_generator are defined identically:

def my_training_batch_generator(Train_df,batch_size,
                    steps):
    idx=1
    while True: 
        yield load_train_data(Train_df,idx-1,batch_size)## Yields data
        
        if idx<steps:
            idx+=1
        else:
            idx=1


dataDir = "/..."
def load_train_data(Train_df,idx,
              batch_size):
  i = 1
  x = np.zeros([batch_size, 100, 100, 100, 3])
  for n in range(idx*batch_size, idx*batch_size + batch_size):
    data = loadmat( Train_df+'volume'+str(n))  
    x[i] = np.array(data['tensor'])
    i = i + 1
  return (np.asarray(x),np.asarray(x))

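So I am fairly sure the generator function passes NumPy arrays to the autoencoder, and I don't understand why the data adapter cannot handle the input. I am new to batch training, and the tutorial I followed (here) was for a classification task, whereas I am using it for image-to-image regression with an autoencoder. Any help would be greatly appreciated!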

【Comments】:

Tags: python numpy tensorflow keras autoencoder

【Solution 1】:

I could not reproduce the problem, so I will share what I do for training with a generator.

First, I suggest printing the generator's output outside the training loop and checking that the shapes match the model's input.
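A minimal sketch of that check, assuming a hypothetical toy generator `toy_batch_generator` in place of the real one (the array shape is made up for illustration):

```python
import numpy as np

def toy_batch_generator(batch_size):
    # Hypothetical stand-in for the real generator: yields (input, target)
    # pairs of identical arrays, as an autoencoder would expect.
    while True:
        x = np.zeros((batch_size, 4, 4, 4, 3))
        yield x, x

gen = toy_batch_generator(batch_size=2)  # call the function to get a generator object
x_batch, y_batch = next(gen)             # pull a single batch by hand
print(type(x_batch), x_batch.shape, y_batch.shape)
```

If `next(gen)` already fails here, the problem is in the generator or the loading code rather than in `fit`.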

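The second thing is that you are passing a function object to the fit method. I don't know whether that syntax can work (in fact, Keras is complaining about the 'function' type).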
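The difference between a generator function and the generator object it returns can be seen in plain Python. This is only an illustration; `batch_generator` here is a hypothetical placeholder, not the asker's code:

```python
import inspect
import types

def batch_generator():
    # Hypothetical generator function used only for this illustration.
    while True:
        yield 0

print(type(batch_generator))    # <class 'function'>
print(type(batch_generator()))  # <class 'generator'>

# The function itself is not a generator; calling it returns one.
is_gen_function = inspect.isgeneratorfunction(batch_generator)
is_gen_object = isinstance(batch_generator(), types.GeneratorType)
```

The type printed for the uncalled function is the same `<class 'function'>` that appears in the error message.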

Hope this helps. This is what works for me (batch size of 1, TF 2.0):

def generate_data():
    # Cycle through the training samples forever, one sample at a time.
    i = -1
    while True:
        i += 1
        if i == len(x_train):
            i = 0

        #print(x_train[i], y_train[i])
        #print(x_train[i].shape, y_train[i].shape)

        yield x_train[i], y_train[i]


def generate_val():
    # Same as generate_data, but over the validation samples.
    i = -1
    while True:
        i += 1
        if i == len(x_test):
            i = 0
        #print(x_test[i], y_test[i])
        #print(x_test[i].shape, y_test[i].shape)

        yield x_test[i], y_test[i]


#....model definition and so on ...

# Note the parentheses: generator objects are passed, not the functions.
history = model.fit(generate_data(), steps_per_epoch=len(x_train), epochs=100,
                    callbacks=[callback], class_weight={0: 4, 1: 1},
                    validation_data=generate_val(), validation_steps=len(x_test))

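【Comments】:

You could also try calling autoencoder.fit(my_training_batch_generator(train_df, bla, bla), ....) so that the required arguments are passed to the function, although I don't see where you actually need Train_df...
I can't print the type of the generator's output because the code breaks before it compiles and runs. What is x_test in your code? It looks to me as if you already have the actual arrays stored in memory, right? I need batch training precisely so that the data is loaded piece by piece.
Hi, yes, I have loaded the complete dataset into memory, but that doesn't matter here. If you can't use the generator on its own, why would it work inside the fit routine? As I understand it, you have a load_train_data function that returns the values read from the file with index n. That should work as a standalone function. Is that the case?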

【Solution 2】:

There were two problems:

Most importantly, I forgot to create the generator objects before defining the model, like this:

my_training_batch_generator = batch_generator(Train_df, 256, steps_per_epoch)
my_testing_batch_generator = batch_generator(Test_df, 256, validation_steps)

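Reason 2) I found that I had two separate generator and loading functions (one for training, one for testing). This is also why I got the error about being unable to handle <class 'function'>: I was passing a function to the autoencoder instead of a generator object.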
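A rough sketch of the resulting setup: one shared generator function, called once per dataset to create the generator objects. The loader `load_batch` is a hypothetical replacement for the file-reading code, and the shapes and step counts are made up. The sketch also fills each batch starting at row 0:

```python
import numpy as np

def load_batch(idx, batch_size):
    # Hypothetical loader: the real one would read files from disk.
    # Each sample is filled with its global sample number instead.
    x = np.zeros((batch_size, 2, 2))
    for i, n in enumerate(range(idx * batch_size, (idx + 1) * batch_size)):
        x[i] = n  # i is the zero-based row inside the batch
    return x, x   # autoencoder: the target equals the input

def batch_generator(batch_size, steps):
    # Cycle through batch indices 0 .. steps-1 forever.
    idx = 0
    while True:
        yield load_batch(idx, batch_size)
        idx = (idx + 1) % steps

# Calling the function creates a generator object for the training data.
train_gen = batch_generator(batch_size=2, steps=3)

# First sample number of four consecutive batches; the fourth wraps around.
first = [float(next(train_gen)[0][0, 0, 0]) for _ in range(4)]
print(first)
```

The generator objects, not `batch_generator` itself, are what would then be passed to `fit` together with `steps_per_epoch` and `validation_steps`.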

【Comments】: