【Question title】: Shape mismatch with vgg16 keras: expected ndim=4, found ndim=2, shape received [None, None]
【Posted】: 2021-01-15 02:39:26
【Question】:

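While trying to learn Keras and deep learning, I wanted to create an image matting algorithm with an architecture resembling a modified autoencoder: it takes two image inputs (a source image and a user-generated trimap) and produces one image output (the alpha values of the image foreground). The encoder part (for both inputs) is simple feature extraction with a pretrained VGG16. I want to train the decoder on the low-resolution alphamatting.com dataset.

Running the attached code produces the error: ValueError: Input 0 of layer block1_conv1 is incompatible with the layer: expected ndim=4, found ndim=2. Full shape received: [None, None]

I can't make sense of this error. I verified that my twin_gen closure produces image batches of shape (22, 256, 256, 3) for both inputs, so I guess I have built the model incorrectly somehow, but I can't see where. Can anyone help explain why I am seeing this error?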

import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2DTranspose, Concatenate, BatchNormalization, Input
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def DeConvBlock(input, num_output):
    x = Conv2DTranspose(num_output, kernel_size=3, strides=2, activation='relu', padding='same')(input)
    x = BatchNormalization()(x)
    x = Conv2DTranspose(num_output, kernel_size=3, strides=1, activation='relu', padding='same')(x)
    x = BatchNormalization()(x)
    x = Conv2DTranspose(num_output, kernel_size=3, strides=1, activation='relu', padding='same')(x)
    x = BatchNormalization()(x)
    return x


img_input = Input((256, 256, 3))
img_vgg16 = VGG16(include_top=False, weights='imagenet')
img_vgg16._name = 'img_vgg16'
img_vgg16.trainable = False


tm_input = Input((256, 256, 3))
tm_vgg16 = VGG16(include_top=False, weights='imagenet')
tm_vgg16._name = 'tm_vgg16'
tm_vgg16.trainable = False

img_vgg16 = img_vgg16(img_input)
tm_vgg16 = tm_vgg16(tm_input)
x = Concatenate()([img_vgg16, tm_vgg16])
x = DeConvBlock(x, 512)
x = DeConvBlock(x, 256)
x = DeConvBlock(x, 128)
x = DeConvBlock(x, 64)
x = DeConvBlock(x, 32)
x = Conv2DTranspose(1, kernel_size=3, strides=1, activation='sigmoid', padding='same')(x)


m = Model(inputs=[img_input, tm_input], outputs=x)
m.summary()
m.compile(optimizer='adam', loss='mean_squared_error')

gen = ImageDataGenerator(width_shift_range=0.1, rotation_range=30, height_shift_range=0.1, horizontal_flip=True, validation_split=0.2, preprocessing_function=preprocess_input)
SEED = 49


def twin_gen(generator, subset):
    gen_img = generator.flow_from_directory('./data', classes=['input_training_lowres'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')
    gen_map = generator.flow_from_directory('./data/trimap_training_lowres', classes=['Trimap1'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')
    gen_truth = generator.flow_from_directory('./data', classes=['gt_training_lowres'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')

    while True:
        img = gen_img.__next__()
        tm = gen_map.__next__()
        gt = gen_truth.__next__()
        yield [[img, tm], gt]


train_gen = twin_gen(gen, 'training')
val_gen = twin_gen(gen, 'validation')


checkpoint_filepath = 'checkpoint'
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='val_loss',
    mode='auto',
    save_freq='epoch',
    save_best_only=True)


r = m.fit(train_gen, validation_data=val_gen, epochs=10, callbacks=[checkpoint])

【Comments】:

    Tags: tensorflow keras deep-learning vgg-net siamese-network


    【Solution 1】:

    First, you did not specify an input shape for VGG16 while setting include_top=False, so in the channels_last case the default input shape is (None, None, 3).

    PS: see the source of keras.applications.VGG16 and keras.applications.imagenet_utils.obtain_input_shape for details.
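
The default-shape behaviour described above can be checked directly. The following is a minimal sketch, assuming TensorFlow is installed; it passes weights=None (an assumption made here only to skip the ImageNet download, since the shapes do not depend on the weights) and compares the reported shapes with and without an explicit input_shape.

```python
from tensorflow.keras.applications.vgg16 import VGG16

# weights=None is used only to avoid downloading ImageNet weights;
# the layer shapes are the same as with weights='imagenet'.
default_vgg = VGG16(include_top=False, weights=None)
fixed_vgg = VGG16(include_top=False, weights=None, input_shape=(256, 256, 3))

# Without input_shape the spatial dimensions stay undefined (None).
print(default_vgg.input_shape, default_vgg.output_shape)

# With input_shape the spatial dimensions are known at build time.
print(fixed_vgg.input_shape, fixed_vgg.output_shape)
```
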

    You can see the None shapes in the output of model.summary():

    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_1 (InputLayer)            [(None, 256, 256, 3) 0                                            
    __________________________________________________________________________________________________
    input_3 (InputLayer)            [(None, 256, 256, 3) 0                                            
    __________________________________________________________________________________________________
    img_vgg16 (Functional)          (None, None, None, 5 14714688    input_1[0][0]                    
    __________________________________________________________________________________________________
    tm_vgg16 (Functional)           (None, None, None, 5 14714688    input_3[0][0]                    
    __________________________________________________________________________________________________
    concatenate (Concatenate)       (None, 8, 8, 1024)   0           img_vgg16[0][0]                  
                                                                     tm_vgg16[0][0]                   
    __________________________________________________________________________________________________
             
    

    To fix this, just set input_shape=(256, 256, 3) in VGG16; calling model.summary() will then give you:

    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to
    ==================================================================================================
    input_1 (InputLayer)            [(None, 256, 256, 3) 0
    __________________________________________________________________________________________________
    input_3 (InputLayer)            [(None, 256, 256, 3) 0
    __________________________________________________________________________________________________
    img_vgg16 (Functional)          (None, 8, 8, 512)    14714688    input_1[0][0]
    __________________________________________________________________________________________________
    tm_vgg16 (Functional)           (None, 8, 8, 512)    14714688    input_3[0][0]
    __________________________________________________________________________________________________
    concatenate (Concatenate)       (None, 8, 8, 1024)   0           img_vgg16[0][0]
                                                                     tm_vgg16[0][0]
    __________________________________________________________________________________________________
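
The 8 x 8 feature maps in the summary follow from simple arithmetic. As a rough sketch (the helper names below are hypothetical and not part of Keras): the VGG16 convolutional base contains five 2x2, stride-2 max-pooling layers, each halving the spatial size, and each DeConvBlock in the question starts with a stride-2 Conv2DTranspose with 'same' padding, which doubles it.

```python
def vgg16_feature_size(size, num_pools=5):
    """Spatial size after the VGG16 conv base: each stride-2 max-pool halves it."""
    for _ in range(num_pools):
        size //= 2
    return size


def decoder_output_size(size, num_blocks=5):
    """Spatial size after the decoder: each DeConvBlock doubles it once."""
    for _ in range(num_blocks):
        size *= 2
    return size


encoded = vgg16_feature_size(256)       # 256 -> 128 -> 64 -> 32 -> 16 -> 8
decoded = decoder_output_size(encoded)  # 8 -> 16 -> 32 -> 64 -> 128 -> 256
print(encoded, decoded)
```

With five decoder blocks the output returns to 256 x 256, so the final single-filter layer produces one channel per pixel at the input resolution.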
                
    

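    The main cause of the error is that __next__() returns a tuple of two arrays (data, label) with shapes ((batch_size, 256, 256, 3), (batch_size, 1)), but we only want the first one.

    In addition, the data generator should yield a tuple rather than a list; otherwise no gradients will be provided for any variable, because fit expects the generator to return (inputs, targets).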
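
The two generator points above can be illustrated without any image data. This is only a sketch: FakeDirectoryIterator is a hypothetical stand-in for the iterator returned by flow_from_directory, and strings take the place of the real arrays so the structure of each yielded item is easy to read.

```python
class FakeDirectoryIterator:
    """Hypothetical stand-in for a flow_from_directory iterator."""

    def __init__(self, name):
        self.name = name

    def __iter__(self):
        return self

    def __next__(self):
        data = f"{self.name}_batch"     # placeholder for the image batch
        labels = f"{self.name}_labels"  # placeholder for the class labels
        return data, labels             # each step returns (data, labels)


def twin_gen(gen_img, gen_map, gen_truth):
    while True:
        img = next(gen_img)[0]  # keep the data, drop the labels
        tm = next(gen_map)[0]
        gt = next(gen_truth)[0]
        yield ([img, tm], gt)   # a tuple: (inputs, targets)


batch = next(twin_gen(FakeDirectoryIterator("img"),
                      FakeDirectoryIterator("tm"),
                      FakeDirectoryIterator("gt")))
inputs, targets = batch
print(inputs, targets)
```
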

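    You also have another problem: your model's output shape is (batch_size, 256, 256, 1), but because you load the gen_truth images with color_mode='rgb', the gen_truth elements have shape (batch_size, 256, 256, 3). To get the same shape as the model output, load gen_truth with color_mode='grayscale' if you have grayscale images, or load it with color_mode='rgba' and take the last channel if you want to use the alpha values. (I am only guessing from the description in your question, but you should get the idea.)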
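
For the 'rgba' option, taking the last channel while keeping a trailing channel axis might look like the sketch below. It assumes NumPy is available and uses a zero-filled array in place of a real batch; the batch size of 22 is taken from the question.

```python
import numpy as np

# Stand-in for one batch loaded with color_mode='rgba'.
rgba_batch = np.zeros((22, 256, 256, 4), dtype=np.float32)

# Slicing with -1: (rather than -1) keeps the channel axis,
# so the result matches the model output shape (batch, 256, 256, 1).
alpha = rgba_batch[..., -1:]
print(alpha.shape)
```
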

    Example code that runs without any problems:

    import tensorflow as tf
    from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
    from tensorflow.keras.models import Model
    from tensorflow.keras.layers import Conv2DTranspose, Concatenate, BatchNormalization, Input
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    
    def DeConvBlock(input, num_output):
        x = Conv2DTranspose(num_output, kernel_size=3, strides=2, activation='relu', padding='same')(input)
        x = BatchNormalization()(x)
        x = Conv2DTranspose(num_output, kernel_size=3, strides=1, activation='relu', padding='same')(x)
        x = BatchNormalization()(x)
        x = Conv2DTranspose(num_output, kernel_size=3, strides=1, activation='relu', padding='same')(x)
        x = BatchNormalization()(x)
        return x
    
    img_input = Input((256, 256, 3))
    img_vgg16 = VGG16(include_top=False, input_shape=(256, 256, 3), weights='imagenet')
    img_vgg16._name = 'img_vgg16'
    img_vgg16.trainable = False
    
    tm_input = Input((256, 256, 3))
    tm_vgg16 = VGG16(include_top=False, input_shape=(256, 256, 3), weights='imagenet')
    tm_vgg16._name = 'tm_vgg16'
    tm_vgg16.trainable = False
    
    img_vgg16 = img_vgg16(img_input)
    tm_vgg16 = tm_vgg16(tm_input)
    x = Concatenate()([img_vgg16, tm_vgg16])
    x = DeConvBlock(x, 512)
    x = DeConvBlock(x, 256)
    x = DeConvBlock(x, 128)
    x = DeConvBlock(x, 64)
    x = DeConvBlock(x, 32)
    x = Conv2DTranspose(1, kernel_size=3, strides=1, activation='sigmoid', padding='same')(x)
    
    m = Model(inputs=[img_input, tm_input], outputs=x)
    m.summary()
    m.compile(optimizer='adam', loss='mse')
    
    gen = ImageDataGenerator(width_shift_range=0.1, rotation_range=30, height_shift_range=0.1, horizontal_flip=True, validation_split=0.2, preprocessing_function=preprocess_input)
    SEED = 49
    
    def twin_gen(generator, subset):
        gen_img = generator.flow_from_directory('./data', classes=['input_training_lowres'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')
        gen_map = generator.flow_from_directory('./data/trimap_training_lowres', classes=['Trimap1'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')
        gen_truth = generator.flow_from_directory('./data', classes=['gt_training_lowres'], seed=SEED, shuffle=False, subset=subset, color_mode='grayscale')
    
        while True:
            img = gen_img.__next__()[0]
            tm = gen_map.__next__()[0]
            gt = gen_truth.__next__()[0]
            yield ([img, tm], gt)
    
    train_gen = twin_gen(gen, 'training')
    
    r = m.fit(train_gen, steps_per_epoch=5, epochs=3)
    

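    【Discussion】:

    • Hmm, you're right, adding input_shape makes the model summary look more correct, but unfortunately I still get the same error. Do I need to set input_tensor or something like that?
    • Then there is also a problem with your data generator; I will update the answer.
    • @ike I updated the answer, see if you still have a problem.
    • Thank you very much, I believe this solves the problem. I am now seeing a CUDA error, but I think that is a completely separate issue and I will try to solve it myself. Thanks again, accepting your answer now.
    • The CUDA error was solvable on my side, so I have verified that your answer here is correct. Thanks again.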