Error: expected conv3d_1_input to have 5 dimensions, but got array with shape (10, 224, 224, 3)

【Question title】: Error: expected conv3d_1_input to have 5 dimensions, but got array with shape (10, 224, 224, 3)
【Posted】: 2020-01-13 19:37:08
【Question description】:

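I am trying to train a neural network on a dataset for liveness anti-spoofing. I have some videos in two folders named genuine and fake. I extracted 10 frames from each video and saved them in two folders with the same names under a new directory, training.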

--/training/
----/genuine/   # contains 10 frames * 300 videos = 3000 images
----/fake/   # contains 10 frames * 800 videos = 8000 images

As a first attempt I designed the following 3D ConvNet with Keras, but running it throws the exception shown after the code:

from keras.preprocessing.image import ImageDataGenerator
from keras import Model, optimizers, activations, losses, regularizers, backend, Sequential
from keras.layers import Dense, MaxPooling3D, AveragePooling3D, Conv3D, Input, Flatten, BatchNormalization

BATCH_SIZE = 10
TARGET_SIZE = (224, 224)

train_datagen = ImageDataGenerator(rescale=1.0/255,
                                   data_format='channels_last',
                                   validation_split=0.2,
                                   shear_range=0.0,
                                   zoom_range=0,
                                   horizontal_flip=False,
                                   featurewise_center=False,
                                   featurewise_std_normalization=False,
                                   width_shift_range=False,
                                   height_shift_range=False)

train_generator = train_datagen.flow_from_directory("./training/",
                                                    target_size=TARGET_SIZE,
                                                    batch_size=BATCH_SIZE,
                                                    class_mode='binary',
                                                    shuffle=False,
                                                    subset='training')

validation_generator = train_datagen.flow_from_directory("./training/",
                                                    target_size=TARGET_SIZE,
                                                    batch_size=BATCH_SIZE,
                                                    class_mode='binary',
                                                    shuffle=False,
                                                    subset='validation')

SHAPE = (10, 224, 224, 3)
model = Sequential()
model.add(Conv3D(filters=128, kernel_size=(1, 3, 3), data_format='channels_last', activation='relu', input_shape=(10, 224, 224, 3)))
model.add(MaxPooling3D(data_format='channels_last', pool_size=(1, 2, 2)))
model.add(Conv3D(filters=64, kernel_size=(2, 3, 3), activation='relu'))
model.add(MaxPooling3D(pool_size=(1, 2, 2)))
model.add(Conv3D(filters=32, kernel_size=(2, 3, 3), activation='relu'))
model.add(Conv3D(filters=32, kernel_size=(2, 3, 3), activation='relu'))
model.add(MaxPooling3D(pool_size=(1, 2, 2)))
model.add(Conv3D(filters=16, kernel_size=(2, 3, 3), activation='relu'))
model.add(Conv3D(filters=16, kernel_size=(2, 3, 3), activation='relu'))
model.add(AveragePooling3D())
model.add(BatchNormalization())
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()
model.compile(optimizer=optimizers.adam(), loss=losses.binary_crossentropy, metrics=['accuracy'])
model.fit_generator(train_generator, steps_per_epoch=train_generator.samples/train_generator.batch_size, epochs=5, validation_data=validation_generator, validation_steps=validation_generator.samples/validation_generator.batch_size)
model.save('3d.h5')

This is the error:

ValueError: Error when checking input: expected conv3d_1_input to have 5 dimensions, but got array with shape (10, 224, 224, 3)

This is the output of model.summary():

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv3d_1 (Conv3D)            (None, 10, 222, 222, 128) 3584      
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 10, 111, 111, 128) 0         
_________________________________________________________________
conv3d_2 (Conv3D)            (None, 9, 109, 109, 64)   147520    
_________________________________________________________________
max_pooling3d_2 (MaxPooling3 (None, 9, 54, 54, 64)     0         
_________________________________________________________________
conv3d_3 (Conv3D)            (None, 8, 52, 52, 32)     36896     
_________________________________________________________________
conv3d_4 (Conv3D)            (None, 7, 50, 50, 32)     18464     
_________________________________________________________________
max_pooling3d_3 (MaxPooling3 (None, 7, 25, 25, 32)     0         
_________________________________________________________________
conv3d_5 (Conv3D)            (None, 6, 23, 23, 16)     9232      
_________________________________________________________________
conv3d_6 (Conv3D)            (None, 5, 21, 21, 16)     4624      
_________________________________________________________________
average_pooling3d_1 (Average (None, 2, 10, 10, 16)     0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 2, 10, 10, 16)     64        
_________________________________________________________________
dense_1 (Dense)              (None, 2, 10, 10, 32)     544       
_________________________________________________________________
dense_2 (Dense)              (None, 2, 10, 10, 1)      33        
=================================================================
Total params: 220,961
Trainable params: 220,929
Non-trainable params: 32
__________________________________________________________

I would appreciate any help with fixing the exception. By the way, in case it helps, I am using TensorFlow as the backend.

【Comments】:

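Doesn't the error say it all? conv3d_1 expects a (None, 10, 222, 222, 128) input, but you are trying to feed it a 10, 224, 224, 3 input. If you want to feed a single data point, you need to reshape your data (e.g. np.expand_dims(input, 0)) so that its size is [1, 10, 222, 222, 128].
@thushv89 Why would it be (None, 10, 222, 222, 128)? As the docs say, the first Conv3D layer takes its input as: "When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the batch axis), e.g. input_shape=(128, 128, 128, 1) for 128x128x128 volumes with a single channel, in data_format="channels_last"." That is the format of a video (frames, height, width, channels). Am I wrong?
How can I reshape the data when using ImageDataGenerator and flow_from_directory? @thushv89
@Mohommad I think you misunderstood. You do that (i.e. leave out the batch dimension) when defining the model. When passing data, however, you cannot leave out the batch dimension. The actual data needs to have that dimension.
Well, the problem is that I don't think ImageDataGenerator is the best way to handle video data, because I don't think you can easily add the missing batch dimension. So you may want to batch the data yourself using __getitem__().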
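To make the batch-dimension point from the comments concrete, here is a minimal NumPy-only sketch. It assumes the model input is (None, 10, 224, 224, 3), as implied by input_shape=(10, 224, 224, 3) in the question, and it uses a zero array as a stand-in for one batch from flow_from_directory. The helper name frames_to_clips is hypothetical, and the grouping is only meaningful if consecutive frames really belong to the same video.

```python
import numpy as np

FRAMES_PER_VIDEO = 10

# Stand-in for one batch from flow_from_directory(batch_size=10):
# Keras reads this as 10 separate samples, each a 4-D-less (224, 224, 3) image.
frames = np.zeros((FRAMES_PER_VIDEO, 224, 224, 3), dtype=np.float32)

# Conv3D wants (batch, frames, height, width, channels), so one video
# needs an explicit leading batch axis.
clip = np.expand_dims(frames, axis=0)
print(frames.shape, "->", clip.shape)


def frames_to_clips(frames, frames_per_video=FRAMES_PER_VIDEO):
    """Group consecutive frames into clips (hypothetical helper).

    Trailing frames that do not fill a whole clip are dropped.
    Assumes the frames are ordered video by video.
    """
    n_clips = frames.shape[0] // frames_per_video
    usable = frames[: n_clips * frames_per_video]
    return usable.reshape((n_clips, frames_per_video) + frames.shape[1:])


# 25 small frames become 2 clips of 10 frames; the last 5 are dropped.
clips = frames_to_clips(np.zeros((25, 8, 8, 3), dtype=np.float32))
print(clips.shape)
```

Note that the labels would need the same regrouping (one label per clip rather than one per frame), which is part of why a dedicated generator is the cleaner route.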

Tags: tensorflow keras deep-learning conv-neural-network

【Solution 1】:

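As @thushv89 mentioned in the comments, Keras has no built-in video generator, which causes a lot of problems for anyone working with large video datasets. So I wrote a simple VideoDataGenerator that works almost as simply as ImageDataGenerator. The script can be found on my GitHub, in case anyone needs it in the future.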
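The author's script is not reproduced here. As a rough illustration of the idea only, the sketch below shows what a clip-level batcher might look like. It is not the author's VideoDataGenerator: the class name ClipBatcher, the load_frame callback, and the file names are all hypothetical. It follows the __len__/__getitem__ protocol that keras.utils.Sequence uses but is written as plain Python with NumPy so that it runs without Keras installed.

```python
import numpy as np


class ClipBatcher:
    """Hypothetical sketch: yields (clips, labels) batches shaped for Conv3D.

    videos     : list of lists, one inner list of frame identifiers per video
    labels     : one label per video
    load_frame : callable mapping a frame identifier to an (H, W, C) array
    """

    def __init__(self, videos, labels, load_frame, batch_size=2):
        self.videos = videos
        self.labels = np.asarray(labels, dtype=np.float32)
        self.load_frame = load_frame
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return int(np.ceil(len(self.videos) / self.batch_size))

    def __getitem__(self, idx):
        start = idx * self.batch_size
        chunk = self.videos[start:start + self.batch_size]
        # Stack frames into a clip, then clips into a batch:
        # (batch, frames, height, width, channels)
        clips = np.stack(
            [np.stack([self.load_frame(f) for f in video]) for video in chunk]
        )
        return clips, self.labels[start:start + self.batch_size]


def fake_loader(_path):
    # Stand-in for real image loading; returns a blank 224x224 RGB frame.
    return np.zeros((224, 224, 3), dtype=np.float32)


# Three made-up videos of 10 frames each.
videos = [[f"video{v}/frame{i}.jpg" for i in range(10)] for v in range(3)]
batcher = ClipBatcher(videos, labels=[1, 0, 1], load_frame=fake_loader)
x, y = batcher[0]
print(len(batcher), x.shape, y.shape)
```

Because each batch already carries the batch axis and one label per video, its shape matches what the Conv3D input layer in the question expects. In a real Keras setup one would presumably subclass keras.utils.Sequence and replace fake_loader with actual image loading and rescaling.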

【Discussion】: