【Question title】: keras-tensorflow CAE dimension mismatch
【Posted】: 2017-09-20 11:00:50
【Question description】:

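I am basically following this guide to build a convolutional autoencoder with the tensorflow backend. The main difference from the guide is that my data consists of 257x257 grayscale images. The following code: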

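import os
import sys

import numpy as np
from scipy import misc
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

TRAIN_FOLDER = 'data/OIRDS_gray/'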
EPOCHS = 10
SHAPE = (257,257,1)

FILELIST = os.listdir(TRAIN_FOLDER)

def loadTrainData():
    train_data = []
    for fn in FILELIST:
        img = misc.imread(TRAIN_FOLDER + fn)
        img = np.reshape(img,(len(img[0,:]), len(img[:,0]), SHAPE[2]))
        if img.shape != SHAPE:
            print "image shape mismatch!"
            print "Expected: " 
            print SHAPE 
            print "but got:"
            print img.shape
            sys.exit()
        train_data.append (img)
    train_data = np.array(train_data)
    train_data = train_data.astype('float32')/ 255

    return np.array(train_data)

def createModel():
    input_img = Input(shape=SHAPE)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)

    x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)  
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(16, (3, 3), activation='relu',padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    decoded = Conv2D(1, (3, 3), activation='sigmoid',padding='same')(x)
    return Model(input_img, decoded)


x_train = loadTrainData()
autoencoder = createModel()
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

print x_train.shape
autoencoder.summary()

# Run the network
autoencoder.fit(x_train, x_train,
                epochs=EPOCHS,
                batch_size=128,
                shuffle=True)

gives me an error: ValueError: Error when checking target: expected conv2d_7 to have shape (None, 260, 260, 1) but got array with shape (859, 257, 257, 1)

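As you can see, this is not the standard theano/tensorflow backend dim ordering problem, but something else. I checked with print x_train.shape that my data has the shape it should have: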

(859, 257, 257, 1)

I also ran autoencoder.summary():

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 257, 257, 1)       0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 257, 257, 16)      160
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 129, 129, 16)      0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 129, 129, 8)       1160
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 65, 65, 8)         0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 65, 65, 8)         584
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 33, 33, 8)         0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 33, 33, 8)         584
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 66, 66, 8)         0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 66, 66, 8)         584
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 132, 132, 8)       0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 132, 132, 16)      1168
_________________________________________________________________
up_sampling2d_3 (UpSampling2 (None, 264, 264, 16)      0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 264, 264, 1)       145
=================================================================
Total params: 4,385
Trainable params: 4,385
Non-trainable params: 0
_________________________________________________________________

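Now I am not sure where exactly the problem lies, but it looks like things go wrong around conv2d_6 (Param # too high). I know how a CAE works in principle, but I am not yet familiar with the exact technical details, and I have mostly tried to solve this by messing with the deconvolution padding (using valid instead of same). The closest I got to matching dims was (None, 258, 258, 1). I got there by blindly trying different padding combinations on the deconvolution side, which is not really a smart way to solve the problem...

At this point I am at a loss, and any help would be appreciated.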

【Question discussion】:

    Tags: tensorflow keras conv-neural-network autoencoder


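    【Solution 1】:

    Since your input and output data are the same, the final output shape should be the same as the input shape.

    The last convolutional layer should have the shape (None, 257, 257, 1).

    The problem arises because the image size is an odd number (257).

    When you apply MaxPooling, it should halve the size, so it has to round either up or down (here it rounds up: see 129, from 257/2 = 128.5).

    Later, when you apply UpSampling, the model does not know that the current size was rounded; it simply doubles the value. Repeated over the three layers, this adds 7 pixels to the final result.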
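The rounding can be traced by hand. The sketch below assumes that `MaxPooling2D((2, 2), padding='same')` produces ceil(n / 2) and that `UpSampling2D((2, 2))` produces 2n, which agrees with the shapes in the summary above. `trace_sizes` is a hypothetical helper written for illustration, not a Keras function.

```python
def trace_sizes(size, n_layers=3):
    """Return the spatial size after each pooling step, then after each upsampling step."""
    sizes = [size]
    for _ in range(n_layers):
        size = (size + 1) // 2  # 'same' pooling with stride 2 rounds up
        sizes.append(size)
    for _ in range(n_layers):
        size *= 2               # upsampling only doubles; the rounding is never undone
        sizes.append(size)
    return sizes

print(trace_sizes(257))  # [257, 129, 65, 33, 66, 132, 264]
print(trace_sizes(264))  # [264, 132, 66, 33, 66, 132, 264]
```

With 257 the decoder ends at 264, seven pixels too many; with 264 the size survives the round trip unchanged.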

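    You can try cropping the result or padding the input.

    I usually work with images of compatible sizes. If you have 3 MaxPooling layers, your size should be a multiple of 2³. The answer here is 264.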
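As a rough sketch of where 264 and the (3, 4) split come from: round the size up to the next multiple of 2³ and spread the extra pixels over both sides. `pad_to_multiple` is a hypothetical helper name, and putting the smaller half first is just one possible convention.

```python
def pad_to_multiple(size, n_pools=3):
    """Return (before, after) padding that lifts size to a multiple of 2**n_pools."""
    block = 2 ** n_pools
    target = -(-size // block) * block  # round up to the next multiple of block
    extra = target - size
    before = extra // 2
    return before, extra - before

print(pad_to_multiple(257))  # (3, 4), i.e. 3 + 257 + 4 = 264
```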


    Padding the input data directly:

    x_train = np.pad(x_train, ((0, 0), (3, 4), (3, 4), (0, 0)), mode='constant')
    

    This requires SHAPE = (264, 264, 1).

    Padding inside the model:

    import keras.backend as K
    from keras.layers import Lambda
    
    input_img = Input(shape=SHAPE)
    x = Lambda(lambda x: K.spatial_2d_padding(x, padding=((3, 4), (3, 4))), output_shape=(264,264,1))(input_img)
    

    Cropping the result:

    This is required in every case where you do not change the actual data (the numpy arrays) directly.

    decoded = Lambda(lambda x: x[:,3:-4,3:-4,:], output_shape=SHAPE)(x)
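To check that the slice really undoes the padding, the same indices can be tried on a plain NumPy array. This is only a sketch: `batch` is a small random stand-in for `x_train`, not the real data.

```python
import numpy as np

# Stand-in for x_train: 2 grayscale images of 257x257 (assumption for illustration).
batch = np.random.rand(2, 257, 257, 1).astype('float32')

# Pad height and width by (3, 4); leave the batch and channel axes untouched.
padded = np.pad(batch, ((0, 0), (3, 4), (3, 4), (0, 0)), mode='constant')

# Crop with the same offsets used in the Lambda layer.
cropped = padded[:, 3:-4, 3:-4, :]

print(padded.shape)                    # (2, 264, 264, 1)
print(cropped.shape)                   # (2, 257, 257, 1)
print(np.array_equal(batch, cropped))  # True
```

Keras also ships `ZeroPadding2D` and `Cropping2D` layers, which accept the same `((top, bottom), (left, right))` tuples and may be easier to read than the `Lambda` versions.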
    

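    【Comments】:

    • Thanks a lot, this solved all the problems I was having. It would have taken me a while to work this out on my own through manual debugging, even though it seems obvious in hindsight.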