Wrong label shape with ImageDataGenerator

【Question title】: Wrong label shape with ImageDataGenerator
【Posted】: 2020-10-12 04:54:09
【Question】:

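I am trying to train a model on images of varying sizes. Normally I would use Flatten, but Flatten() expects all images to have a fixed size, and mine do not.

Here I tried replacing Flatten with GlobalMaxPool2D(), but in the end I ran into a problem with the expected shape. I am new to TensorFlow and have a hard time understanding where I can adjust my model to avoid this problem.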
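To illustrate why global pooling removes the dependence on image size, here is a minimal NumPy sketch (not the Keras layer itself). Taking the maximum over the height and width axes leaves one value per channel, whatever the spatial size. The function name `global_max_pool_2d` and the random inputs are hypothetical and only serve the illustration.

```python
import numpy as np


def global_max_pool_2d(feature_maps):
    """Reduce (batch, height, width, channels) to (batch, channels).

    Hypothetical NumPy stand-in for what a global max pooling layer computes.
    """
    return feature_maps.max(axis=(1, 2))


rng = np.random.default_rng(0)
small = rng.random((4, 20, 30, 32))  # 4 feature maps of 20x30 with 32 channels
large = rng.random((4, 64, 48, 32))  # same channel count, different spatial size

pooled_small = global_max_pool_2d(small)
pooled_large = global_max_pool_2d(large)

# Both results have the same shape, so a Dense layer can follow either one.
print(pooled_small.shape, pooled_large.shape)  # (4, 32) (4, 32)
```

The fixed second dimension equals the number of channels, which matches the (None, 32) output reported for the global pooling layer in the model summary.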

Code (some imports are unnecessary here but will be used later; I included them in case of a possible incompatibility):

from __future__ import print_function
import keras

from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D, GlobalMaxPool2D
import os
from random import shuffle

train_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our training data
validation_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our validation data
batch_size = 128

train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=f"/kaggle/working",
                                                           shuffle=True,
                                                           class_mode='binary')
val_data_gen = validation_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=f"/kaggle/working/",
                                                           shuffle=True,
                                                           class_mode='binary')

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(None,None,3))) #We change the input shape because the images have different shapes but always 3 chan.
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# model.add(Flatten()) #as all the pictures have different size, flatten does not work. Possibly other solutions found there :
model.add(GlobalMaxPool2D())
# https://stackoverflow.com/questions/47795697/how-to-give-variable-size-images-as-input-in-keras
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

# initiate RMSprop optimizer
opt = keras.optimizers.rmsprop(lr=0.0001, decay=1e-6)

# Let's train the model using RMSprop
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

# X_train_i = X_train_i.astype('float32')
# X_test_i = X_test_i.astype('float32')
X_train_i /= 255
X_test_i /= 255
model.summary()
model.fit_generator(train_data_gen,
        steps_per_epoch=2000,
        epochs=10,
        validation_data=val_data_gen,
        validation_steps=800)
#             batch_size=batch_size,
#             epochs=epochs,
#             validation_data=(X_test_i, y_test),
#             shuffle=True)


# Score trained model.
scores = model.evaluate(X_test_i, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

The model summary is as follows:

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_20 (Conv2D)           (None, None, None, 32)    896       
_________________________________________________________________
activation_38 (Activation)   (None, None, None, 32)    0         
_________________________________________________________________
conv2d_21 (Conv2D)           (None, None, None, 32)    9248      
_________________________________________________________________
activation_39 (Activation)   (None, None, None, 32)    0         
_________________________________________________________________
dropout_20 (Dropout)         (None, None, None, 32)    0         
_________________________________________________________________
global_max_pooling2d_9 (Glob (None, 32)                0         
_________________________________________________________________
dense_19 (Dense)             (None, 512)               16896     
_________________________________________________________________
activation_40 (Activation)   (None, 512)               0         
_________________________________________________________________
dropout_21 (Dropout)         (None, 512)               0         
_________________________________________________________________
dense_20 (Dense)             (None, 2)                 1026      
_________________________________________________________________
activation_41 (Activation)   (None, 2)                 0         
=================================================================
Total params: 28,066
Trainable params: 28,066
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10

The error is as follows:

ValueError: Error when checking target: expected activation_41 to have shape (2,) but got array with shape (1,)

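The value does indeed seem to be "halved", but I tried removing some layers and could not get it to work.

Also, if you can recommend a tutorial to better understand these concepts, I am all ears.

Thank you very much ++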

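【Comments】:

What is the shape of the array holding the class labels? There may be a mismatch between the shape of the output and the class label array. Since you are doing binary classification, your class label array probably has shape (n x 1), while your network expects an array of shape (n x 2).
Hi Koralp, thanks for your answer. Actually, I don't have "labels" as such, since ImageDataGenerator.flow_from_directory does the job by looking for images in different folders (as far as I understand). Here I have just 2 images, one in each of 2 different folders (named ISUP1 and ISUP2). At the end of the image generator I get the following: Found 2 images belonging to 2 classes.
Hmm, yes, ImageDataGenerator does feed the data to the model iteratively. However, if you look here tensorflow.org/api_docs/python/tf/keras/preprocessing/image/… you can see that using class_mode as binary will produce 1D binary labels, which can cause a dimension mismatch.
Ah! Amazing! It works! If I want to use binary classification with ImageDataGenerator, then I should use a number of classes = 1. Thank you very much ++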
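The mismatch described in these comments can be reproduced with plain NumPy. This is a sketch under stated assumptions: the label values are made up, and the arrays only imitate the shapes a generator would yield with class_mode='binary' versus class_mode='categorical'.

```python
import numpy as np

num_classes = 2

# Labels in the style of class_mode='binary': one 0/1 scalar per image.
binary_labels = np.array([0, 1, 1, 0], dtype="float32")

# A 2-unit softmax output yields one probability per class per image.
model_output_shape = (len(binary_labels), num_classes)

# One-hot labels in the style of class_mode='categorical' (made-up values).
one_hot_labels = np.eye(num_classes, dtype="float32")[binary_labels.astype(int)]

print(binary_labels.shape)   # (4,)   one value per sample
print(model_output_shape)    # (4, 2) two values per sample
print(one_hot_labels.shape)  # (4, 2) matches the model output
```

Only the one-hot array lines up with a 2-unit output, which is consistent with the error message expecting shape (2,) per sample but receiving (1,).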

Tags: python tensorflow machine-learning keras deep-learning

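【Solution 1】:

I don't think you should set n_classes=1 (as your comment says), because it is not true and may cause confusion. You can use an approach that works in all cases.

Using class_mode='categorical' will work in all cases, regardless of the number of classes.

Then, in your last layer, you don't even have to set the number of classes manually; you can do this: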

Dense(units=len(train_data_gen.class_indices))

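That way the number of neurons in the final layer will always match the number of classes. Then always make sure you use a loss function that accepts one-hot encoded targets (e.g., categorical_crossentropy), and you are good to go.

【Comments】:

Thank you very much, that is indeed more reliable.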
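As a rough illustration of that matching, the sketch below sizes the output from a class_indices-style dictionary and evaluates categorical cross-entropy on one-hot targets in NumPy. The dictionary reuses the folder names mentioned in the comments (ISUP1, ISUP2) as an assumed mapping; the predictions are made up, and the helper `categorical_crossentropy` is a hypothetical stand-in that only imitates the Keras loss of the same name.

```python
import numpy as np

# Assumed mapping in the style of train_data_gen.class_indices.
class_indices = {"ISUP1": 0, "ISUP2": 1}
units = len(class_indices)  # number of output units for the last Dense layer


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))


# Made-up one-hot targets and softmax-like predictions, both of shape (3, units).
y_true = np.eye(units)[[0, 1, 1]]
y_pred = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]])

loss = categorical_crossentropy(y_true, y_pred)
print(units, y_true.shape == y_pred.shape, loss)
```

Because `units` is derived from the mapping, adding a third class folder would change both the one-hot width and the output width together, which is the point of the suggestion above.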