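【Posted】: 2018-06-13 21:56:49
【Question】:
I am new to Keras and have a problem: given an image, I need to build a convolutional neural network that outputs another image based on it.
All the examples I have seen online so far are classification problems, where each image is assigned a one-hot encoded label. I want to use an image as the label instead.
【Comments】:
Any image autoencoder uses images as labels.
Tags: machine-learning keras computer-vision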
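【Answer】:
A series of progressive convolutions can be followed by a series of resizing interpolations, implemented for example in a layer like this: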
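from keras import backend as K
from keras.layers import Layer

class Interpolation(Layer):
    """Resizes a channels_last feature map to output_dim with bilinear interpolation."""
    def __init__(self, output_dim, num_channels, **kwargs):
        self.num_channels = num_channels
        self.output_dim = output_dim  # target (height, width)
        super(Interpolation, self).__init__(**kwargs)

    def build(self, input_shape):
        # No trainable weights; the layer only resizes.
        super(Interpolation, self).build(input_shape)

    def call(self, x):
        # Requires the TensorFlow backend (K.tf is the backend's tensorflow module).
        return K.tf.image.resize_bilinear(x, self.output_dim)

    def compute_output_shape(self, input_shape):
        # Report the actual resize target rather than assuming a 2x upscale.
        return input_shape[0], self.output_dim[0], self.output_dim[1], self.num_channels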
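To make the resizing step less of a black box, here is a minimal pure-Python sketch of bilinear interpolation on a single-channel grid. It is only an illustration: the function name `resize_bilinear` and the sampling convention (output pixel `(i, j)` samples the input at `(i * in_h / out_h, j * in_w / out_w)`) are assumptions made for this example, and real libraries offer several pixel-alignment conventions that may give slightly different values.

```python
def resize_bilinear(image, out_h, out_w):
    """Resize a 2D grid (list of rows) using bilinear interpolation.

    Assumed convention: output pixel (i, j) samples the input at
    (i * in_h / out_h, j * in_w / out_w), clamping at the border.
    """
    in_h, in_w = len(image), len(image[0])
    out = []
    for i in range(out_h):
        y = i * in_h / out_h
        y0 = min(int(y), in_h - 1)   # row above the sample point
        y1 = min(y0 + 1, in_h - 1)   # row below, clamped to the last row
        wy = y - y0                  # vertical blend weight
        row = []
        for j in range(out_w):
            x = j * in_w / out_w
            x0 = min(int(x), in_w - 1)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0              # horizontal blend weight
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


small = [[0.0, 2.0],
         [4.0, 6.0]]
big = resize_bilinear(small, 4, 4)
for r in big:
    print(r)
```

Each output value is a weighted blend of up to four neighbouring input pixels, which is why the operation has no trainable parameters.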
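You can then apply a series of transformations that produce an output whose shape matches your input shape. Below is sample code showing how the layer is used: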
from keras.layers import Input, Conv2D, BatchNormalization, MaxPooling2D, Add
from keras.models import Model

# grayscale in
uncolored = Input(shape=(200, 200, 1))

# first block 200x200x3
conv0 = Conv2D(3, (3, 3), padding='same', activation='relu', data_format='channels_last', name='0', kernel_regularizer='l2')(uncolored)
bn0 = BatchNormalization()(conv0)

# second block 200x200x64 (used only for the skip connection through bn1)
conv1 = Conv2D(64, (3, 3), padding='same', activation='relu', data_format='channels_last', kernel_regularizer='l2')(conv0)
bn1 = BatchNormalization()(conv1)  # 200x200x64
pool0 = MaxPooling2D(pool_size=2, padding='same')(conv0)  # 100x100x3 (pools conv0, not conv1)

# third block 100x100x128
conv2 = Conv2D(128, (3, 3), padding='same', activation='relu', data_format='channels_last', kernel_regularizer='l2')(pool0)
bn2 = BatchNormalization()(conv2)  # 100x100x128
pool1 = MaxPooling2D(pool_size=2, padding='same')(conv2)  # 50x50x128

# fourth block 50x50x256
conv3 = Conv2D(256, (3, 3), padding='same', activation='relu', data_format='channels_last', name='2', kernel_regularizer='l2')(pool1)
bn3 = BatchNormalization()(conv3)  # 50x50x256
pool2 = MaxPooling2D(pool_size=2, padding='same')(conv3)  # 25x25x256

# fifth block 25x25x512
conv4 = Conv2D(512, (3, 3), padding='same', activation='relu', data_format='channels_last', kernel_regularizer='l2')(pool2)
bn4 = BatchNormalization()(conv4)  # defined but not connected to the output
rconv0 = Conv2D(256, (1, 1), padding='same', activation='sigmoid', data_format='channels_last', kernel_regularizer='l2')(conv4)

# first upscale: 25x25 -> 50x50
interp_layer0 = Interpolation(output_dim=(50, 50), num_channels=256)(rconv0)
# first addition (skip connection from the fourth block)
intermediate_0 = Add()([interp_layer0, bn3])
rconv1 = Conv2D(128, (3, 3), padding='same', activation='sigmoid', data_format='channels_last')(intermediate_0)

# second upscale: 50x50 -> 100x100
interp_layer1 = Interpolation(output_dim=(100, 100), num_channels=128)(rconv1)
# second addition (skip connection from the third block)
intermediate_1 = Add()([interp_layer1, bn2])
rconv2 = Conv2D(64, (3, 3), padding='same', activation='sigmoid', data_format='channels_last')(intermediate_1)

# third upscale: 100x100 -> 200x200
interp_layer2 = Interpolation(output_dim=(200, 200), num_channels=64)(rconv2)
# third addition (skip connection from the second block)
intermediate_2 = Add()([interp_layer2, bn1])
rconv3 = Conv2D(3, (3, 3), padding='same', activation='sigmoid', data_format='channels_last')(intermediate_2)

# fourth addition (skip connection from the first block)
intermediate_3 = Add()([rconv3, bn0])
rconv4 = Conv2D(3, (3, 3), padding='same', activation='sigmoid', data_format='channels_last')(intermediate_3)

model = Model(inputs=[uncolored], outputs=[rconv4])
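The spatial sizes in the comments above can be checked by hand. The sketch below traces them, assuming that a pooling layer with `padding='same'` and stride equal to the pool size produces `ceil(n / 2)` pixels per side, and that each upscale targets the size of the matching encoder stage. The helper names `same_pool` and `trace_sizes` are made up for this illustration.

```python
import math


def same_pool(size, pool=2):
    # With 'same' padding and stride == pool size, output = ceil(input / pool).
    return math.ceil(size / pool)


def trace_sizes(start, num_pools):
    """Return (encoder sizes, decoder resize targets) for one spatial side."""
    down = [start]
    for _ in range(num_pools):
        down.append(same_pool(down[-1]))
    # Each upscale must land on the size of the encoder stage it is added to.
    up = list(reversed(down[:-1]))
    return down, up


down, up = trace_sizes(200, 3)
print(down)  # [200, 100, 50, 25]
print(up)    # [50, 100, 200]

# An input whose side is not divisible by 8 does not double cleanly on the way up.
odd_down, odd_up = trace_sizes(75, 3)
print(odd_down)  # [75, 38, 19, 10]
print(odd_up)    # [19, 38, 75]
```

For inputs like the 75-pixel case, a plain 2x upscale from 10 would give 20 rather than 19, so the `Add` layers would fail on a shape mismatch. Passing explicit targets through `output_dim` avoids that.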
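And the model summary (judging by the parameter counts of the last two rows, it comes from a variant in which the second-to-last convolution uses a 1x1 kernel and the final convolution has a single output filter):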
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_5 (InputLayer)            (None, 200, 200, 1)  0
__________________________________________________________________________________________________
0 (Conv2D)                      (None, 200, 200, 3)  30          input_5[0][0]
__________________________________________________________________________________________________
max_pooling2d_13 (MaxPooling2D) (None, 100, 100, 3)  0           0[0][0]
__________________________________________________________________________________________________
conv2d_34 (Conv2D)              (None, 100, 100, 128 3584        max_pooling2d_13[0][0]
__________________________________________________________________________________________________
max_pooling2d_14 (MaxPooling2D) (None, 50, 50, 128)  0           conv2d_34[0][0]
__________________________________________________________________________________________________
2 (Conv2D)                      (None, 50, 50, 256)  295168      max_pooling2d_14[0][0]
__________________________________________________________________________________________________
max_pooling2d_15 (MaxPooling2D) (None, 25, 25, 256)  0           2[0][0]
__________________________________________________________________________________________________
conv2d_35 (Conv2D)              (None, 25, 25, 512)  1180160     max_pooling2d_15[0][0]
__________________________________________________________________________________________________
conv2d_36 (Conv2D)              (None, 25, 25, 256)  131328      conv2d_35[0][0]
__________________________________________________________________________________________________
interpolation_13 (Interpolation (None, 50, 50, 256)  0           conv2d_36[0][0]
__________________________________________________________________________________________________
batch_normalization_24 (BatchNo (None, 50, 50, 256)  1024        2[0][0]
__________________________________________________________________________________________________
add_17 (Add)                    (None, 50, 50, 256)  0           interpolation_13[0][0]
                                                                 batch_normalization_24[0][0]
__________________________________________________________________________________________________
conv2d_37 (Conv2D)              (None, 50, 50, 128)  295040      add_17[0][0]
__________________________________________________________________________________________________
interpolation_14 (Interpolation (None, 100, 100, 128 0           conv2d_37[0][0]
__________________________________________________________________________________________________
batch_normalization_23 (BatchNo (None, 100, 100, 128 512         conv2d_34[0][0]
__________________________________________________________________________________________________
add_18 (Add)                    (None, 100, 100, 128 0           interpolation_14[0][0]
                                                                 batch_normalization_23[0][0]
__________________________________________________________________________________________________
conv2d_38 (Conv2D)              (None, 100, 100, 64) 73792       add_18[0][0]
__________________________________________________________________________________________________
conv2d_33 (Conv2D)              (None, 200, 200, 64) 1792        0[0][0]
__________________________________________________________________________________________________
interpolation_15 (Interpolation (None, 200, 200, 64) 0           conv2d_38[0][0]
__________________________________________________________________________________________________
batch_normalization_22 (BatchNo (None, 200, 200, 64) 256         conv2d_33[0][0]
__________________________________________________________________________________________________
add_19 (Add)                    (None, 200, 200, 64) 0           interpolation_15[0][0]
                                                                 batch_normalization_22[0][0]
__________________________________________________________________________________________________
conv2d_39 (Conv2D)              (None, 200, 200, 3)  195         add_19[0][0]
__________________________________________________________________________________________________
batch_normalization_21 (BatchNo (None, 200, 200, 3)  12          0[0][0]
__________________________________________________________________________________________________
add_20 (Add)                    (None, 200, 200, 3)  0           conv2d_39[0][0]
                                                                 batch_normalization_21[0][0]
__________________________________________________________________________________________________
conv2d_40 (Conv2D)              (None, 200, 200, 1)  28          add_20[0][0]
==================================================================================================
Total params: 1,982,921
Trainable params: 1,982,019
Non-trainable params: 902
__________________________________________________________________________________________________
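The parameter counts in the summary can be reproduced by hand, which is a handy sanity check when adapting the architecture. The sketch below assumes the usual counting rules (a convolution has `k * k * in * out` weights plus `out` biases; batch normalization keeps four values per channel, half of them non-trainable). The kernel sizes and channel counts in the list are inferred from the summary rows, not taken from the author, and the helper names are made up for this example.

```python
def conv2d_params(kernel, in_ch, out_ch):
    # kernel x kernel weights per (input, output) channel pair, plus one bias per filter
    return kernel * kernel * in_ch * out_ch + out_ch


def batchnorm_params(channels):
    # gamma and beta are trainable; moving mean and variance are not
    return 4 * channels


# (kernel, in_channels, out_channels), inferred from the Conv2D rows in summary order
convs = [
    (3, 1, 3),      # 0
    (3, 3, 128),    # conv2d_34
    (3, 128, 256),  # 2
    (3, 256, 512),  # conv2d_35
    (1, 512, 256),  # conv2d_36
    (3, 256, 128),  # conv2d_37
    (3, 128, 64),   # conv2d_38
    (3, 3, 64),     # conv2d_33
    (1, 64, 3),     # conv2d_39
    (3, 3, 1),      # conv2d_40
]
bn_channels = [256, 128, 64, 3]  # batch_normalization_24 .. _21

conv_counts = [conv2d_params(*c) for c in convs]
bn_total = sum(batchnorm_params(c) for c in bn_channels)

total = sum(conv_counts) + bn_total
non_trainable = bn_total // 2
trainable = total - non_trainable

print(conv_counts)
print(total, trainable, non_trainable)  # 1982921 1982019 902
```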
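In this example we go from a single-channel image to a multi-channel image. The same idea can be replicated for images of any size or number of channels. The exact network architecture of course depends on what you want it to do.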
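On the original question of using an image as the label: once the output has the same shape as the target image, the loss is simply computed pixel by pixel. The answer does not say which loss to use, so the sketch below assumes mean squared error, and the function name `pixel_mse` is made up for this illustration. Because the final layer uses a sigmoid activation, the target images would need to be scaled to the range [0, 1].

```python
def pixel_mse(predicted, target):
    """Mean squared error over every pixel and channel of two H x W x C images."""
    total, count = 0.0, 0
    for pred_row, tgt_row in zip(predicted, target):
        for pred_px, tgt_px in zip(pred_row, tgt_row):
            for p, t in zip(pred_px, tgt_px):
                total += (p - t) ** 2
                count += 1
    return total / count


# A tiny 2 x 2 RGB "label" image with values already scaled to [0, 1].
target = [
    [[0.0, 0.5, 1.0], [1.0, 1.0, 1.0]],
    [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]],
]
# A prediction that is off by 0.5 in every channel of every pixel.
shifted = [[[c + 0.5 for c in px] for px in row] for row in target]

print(pixel_mse(target, target))   # 0.0
print(pixel_mse(shifted, target))  # 0.25
```

In Keras this would typically correspond to compiling the model with `loss='mse'` and passing the array of target images as `y` to `model.fit`, in place of one-hot labels.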
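【Comments】:
K comes from: from keras import backend as K. The rest of the network is just an example of how to use this layer to resize the convolution outputs and end up with an "image"-like shape at the output.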