【Question title】: How to train a neural network in TensorFlow
【Posted】: 2020-02-27 23:15:10
【Question】:

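I am using Google's TensorFlow in a Colab notebook to load a pretrained neural network. I removed the fully connected output layer, added a new fully connected layer with a single neuron, and froze the other layers. I am using tf.keras.applications.MobileNetV2 with the mledu-datasets/cats_and_dogs dataset. I only want to train the added output layer, but I get an error.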

My code is as follows:

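import os

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'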

path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)

PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')

train_cats_dir = os.path.join(train_dir, 'cats')  # directory with our training cat pictures
train_dogs_dir = os.path.join(train_dir, 'dogs')  # directory with our training dog pictures
validation_cats_dir = os.path.join(validation_dir, 'cats')  # directory with our validation cat pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs')  # directory with our validation dog pictures


num_cats_tr = len(os.listdir(train_cats_dir))
num_dogs_tr = len(os.listdir(train_dogs_dir))

num_cats_val = len(os.listdir(validation_cats_dir))
num_dogs_val = len(os.listdir(validation_dogs_dir))

total_train = num_cats_tr + num_dogs_tr
total_val = num_cats_val + num_dogs_val

print('total training cat images:', num_cats_tr)
print('total training dog images:', num_dogs_tr)

print('total validation cat images:', num_cats_val)
print('total validation dog images:', num_dogs_val)
print("--")
print("Total training images:", total_train)
print("Total validation images:", total_val)


batch_size = 32
epochs = 15
IMG_HEIGHT = 160
IMG_WIDTH = 160

train_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our training data
validation_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our validation data

train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                           class_mode='binary')




val_data_gen = validation_image_generator.flow_from_directory(batch_size=batch_size,
                                                              directory=validation_dir,
                                                              target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                              class_mode='binary')

sample_training_images, _ = next(train_data_gen)


# This function will plot images in the form of a grid with 1 row and 5 columns where images are placed in each column.
def plotImages(images_arr):
    fig, axes = plt.subplots(1, 5, figsize=(20,20))
    axes = axes.flatten()
    for img, ax in zip( images_arr, axes):
        ax.imshow(img)
        ax.axis('off')
    plt.tight_layout()
    plt.show()



plotImages(sample_training_images[:5])

## Create the model
model = tf.keras.applications.mobilenet_v2.MobileNetV2(input_shape=(IMG_HEIGHT, IMG_WIDTH ,3), alpha=1.0, include_top=False, weights='imagenet', input_tensor=None , pooling='max', classes=2)
model.summary()
penultimate_layer = model.layers[-2]  # layer that you want to connect your new FC layer to 
new_top_layer = tf.keras.layers.Dense(1)(penultimate_layer.output) # create new FC layer and connect it to the rest of the model
new_new_top_layer = tf.keras.layers.AveragePooling2D(pool_size=(2, 2), strides=None, padding='valid', data_format=None)(new_top_layer)
new_model = tf.keras.models.Model(inputs=model.input, outputs=new_new_top_layer)  # define your new model
new_model.summary()


for layer in new_model.layers[:-2]:
    layer.trainable = False
new_model.layers[-1].trainable = True

For training:

new_model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

This is the part of the code where I run into the problem:

history = new_model.fit_generator(
    train_data_gen,
    steps_per_epoch = total_train // batch_size,
    epochs = epochs,
    validation_data = val_data_gen,
    validation_steps = total_val // batch_size
)

I get the following error:

Epoch 1/15

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-38-55517a65f99f> in <module>()
      4     epochs = epochs,
      5     validation_data = val_data_gen,
----> 6     validation_steps = total_val // batch_size
      7 )

5 frames

/tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/training_utils.py in check_loss_and_target_compatibility(targets, loss_fns, output_shapes)
    741           raise ValueError('A target array with shape ' + str(y.shape) +
    742                            ' was passed for an output of shape ' + str(shape) +
--> 743                            ' while using as loss `' + loss_name + '`. '
    744                            'This loss expects targets to have the same shape '
    745                            'as the output.')

ValueError: A target array with shape (32, 1) was passed for an output of shape (None, 2, 2, 1) while using as loss `binary_crossentropy`. This loss expects targets to have the same shape as the output.

I have to configure the network with these settings:

batch_size = 32
epochs = 15
IMG_HEIGHT = 160
IMG_WIDTH = 160

Thanks.

【Comments】:

You have to add a Flatten layer before the Dense layer, so that the output of your pretrained network is 2D instead of 4D.
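
To see where the 4D output in the error message comes from, the shapes can be traced by hand. The sketch below is plain Python arithmetic, not a Keras call. It assumes MobileNetV2's overall downsampling factor of 32 and its 1280 final channels at alpha=1.0; `pool_output_size` is a helper name introduced here for illustration only.

```python
def pool_output_size(size, window, stride):
    """Spatial size after a 'valid' pooling window: floor((size - window) / stride) + 1."""
    return (size - window) // stride + 1


IMG_SIZE = 160       # IMG_HEIGHT and IMG_WIDTH from the question
TOTAL_STRIDE = 32    # assumption: MobileNetV2 downsamples its input by a factor of 32
CHANNELS = 1280      # assumption: channels of the last feature map at alpha=1.0

# model.layers[-2] is the last 4D feature map, i.e. the layer just before
# the global max pooling that pooling='max' appends.
feature_map = (IMG_SIZE // TOTAL_STRIDE, IMG_SIZE // TOTAL_STRIDE, CHANNELS)

# Dense(1) only acts on the last axis, so both spatial axes survive.
after_dense = feature_map[:2] + (1,)

# AveragePooling2D(pool_size=(2, 2), strides=None) uses strides equal to pool_size.
pooled = pool_output_size(after_dense[0], window=2, stride=2)
after_pool = (pooled, pooled, 1)

print("feature map:   ", feature_map)
print("after Dense(1):", after_dense)
print("after pooling: ", after_pool)
```

The last tuple, with the batch axis in front, is the (None, 2, 2, 1) output reported in the ValueError, while the generator yields labels of shape (32, 1).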

Tags: python-3.x tensorflow keras neural-network deep-learning

【Answer 1】:

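The problem is with the last Dense layer: you have an output layer with 2 neurons, but since this is binary classification with binary_crossentropy, it should be Dense(1). From the error you can see that the generator is creating target arrays of shape (batch_size, output_size). It is also worth calling new_model.summary() to get a better picture of each layer's input/output shapes and trainable parameters.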

Edit:

As @matias-valdenegro points out, you also need to add a Flatten() layer before Dense(1) for this to work, since there appears to be a dimensionality problem.
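
A minimal sketch of how the suggestions above could fit together, assuming the new head is attached to the base model's own output rather than to model.layers[-2]. With include_top=False and pooling='max' that output is already 2D, so Flatten() changes nothing and is kept only to mirror the answer. The sigmoid activation is an addition not mentioned in the thread: the string loss 'binary_crossentropy' expects probabilities by default. `build_transfer_model` and its `weights` parameter are illustrative names, not from the original post.

```python
import tensorflow as tf

IMG_HEIGHT = 160
IMG_WIDTH = 160


def build_transfer_model(weights="imagenet"):
    """Frozen MobileNetV2 base plus one trainable sigmoid unit (illustrative helper)."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=(IMG_HEIGHT, IMG_WIDTH, 3),
        include_top=False,   # drop the ImageNet classifier
        weights=weights,
        pooling="max",       # global max pooling, so the output is (None, 1280)
    )
    base.trainable = False   # freeze every pretrained layer

    # Flatten is a no-op on an already 2D tensor; kept to mirror the answer.
    x = tf.keras.layers.Flatten()(base.output)
    # Sigmoid keeps the output in [0, 1], as 'binary_crossentropy' expects by default.
    output = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.models.Model(inputs=base.input, outputs=output)
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling new_model = build_transfer_model() and then new_model.summary() should report an output shape of (None, 1), which matches the (32, 1) target arrays from the generators, so the training call from the question should be usable as is.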

【Discussion】:

I have changed the last layer to 1 neuron, but it still doesn't work. The batch_size has to be 32.
I have added a Flatten(), but that didn't work either. model = tf.keras.applications.mobilenet_v2.MobileNetV2(input_shape=(IMG_HEIGHT, IMG_WIDTH ,3), alpha=1.0, include_top=False, weights='imagenet', input_tensor=None , pooling='max', classes=2) penultimate_layer = model.layers[-2] # layer that you want to connect your new FC layer to new_top_layer = tf.keras.layers.Flatten() # create new FC layer and connect it to the rest of the model new_model = tf.keras.models.Model(model.input, new_top_layer) # define your new model
I think I have to add a max pooling layer, but I don't know how to do it.
I can't tell whether the comment formatting garbled your code, but it looks like your Flatten and Dense(1) layers are not connected properly: model = tf.keras.applications.mobilenet_v2.MobileNetV2(input_shape=(IMG_HEIGHT, IMG_WIDTH ,3), alpha=1.0, include_top=False, weights='imagenet', input_tensor=None , pooling='max', classes=2); new_top_layer = tf.keras.layers.Flatten()(model.output); output = tf.keras.layers.Dense(1)(new_top_layer) Is this what you implemented?
Yes, that is what I have.