【Question title】: What should the filters and kernel size be in Conv2DTranspose?
【Posted】: 2021-03-17 10:51:21
【Question】:

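I am trying to build a simple GAN but cannot work out the right parameters. Consider the generator and discriminator code below. It produces images with height 32 and width 54.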

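# Imports assumed for tf.keras; the original post does not show them.
from tensorflow.keras import layers
from tensorflow.keras.layers import (BatchNormalization, Conv2D,
                                     Conv2DTranspose, Dense, Dropout,
                                     Embedding, Flatten, Input, LeakyReLU,
                                     Reshape)
from tensorflow.keras.models import Model, Sequential

# num_classes is used below but is not defined in the post; set it to the
# number of labels in your dataset before calling these functions.

def build_generator(latent_size=100):
    # we will map a pair of (z, L), where z is a latent vector and L is a
    # label drawn from P_c, to image space (..., 32, 54, 3)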
    cnn = Sequential()

    cnn.add(Dense(3*54*32, input_dim=latent_size, activation='relu'))
    cnn.add(Reshape((4, 3, 432)))

    # upsample to (8, 6, ...)
    cnn.add(Conv2DTranspose(192, 2, strides=2, padding='valid',
                        activation='relu',
                        kernel_initializer='glorot_normal'))
    cnn.add(BatchNormalization())

    # upsample to (16, 18, ...)
    cnn.add(Conv2DTranspose(96, 5, strides=(2,3), padding='same',
                        activation='relu',
                        kernel_initializer='glorot_normal'))
    cnn.add(BatchNormalization())

    # upsample to (32, 54, ...)
    cnn.add(Conv2DTranspose(3, 5, strides=(2,3), padding='same',
                        activation='tanh',
                        kernel_initializer='glorot_normal'))


    # this is the z space commonly referred to in GAN papers
    latent = Input(shape=(latent_size, ))

    # this will be our label
    image_class = Input(shape=(1,), dtype='int32')

    cls = Embedding(num_classes, latent_size,
                    embeddings_initializer='glorot_normal')(image_class)

    # hadamard product between z-space and a class conditional embedding
    h = layers.multiply([latent, cls])

    fake_image = cnn(h)

    return Model([latent, image_class], fake_image)


def build_discriminator():
    # build a relatively standard conv net, with LeakyReLUs as suggested in
    # the reference paper
    cnn = Sequential()

    cnn.add(Conv2D(32, 3, padding='same', strides=2,
                   input_shape=(32, 54, 3)))
    cnn.add(LeakyReLU(0.2))
    cnn.add(Dropout(0.3))

    cnn.add(Conv2D(64, 3, padding='same', strides=1))
    cnn.add(LeakyReLU(0.2))
    cnn.add(Dropout(0.3))

    cnn.add(Conv2D(128, 3, padding='same', strides=2))
    cnn.add(LeakyReLU(0.2))
    cnn.add(Dropout(0.3))

    cnn.add(Conv2D(256, 3, padding='same', strides=1))
    cnn.add(LeakyReLU(0.2))
    cnn.add(Dropout(0.3))

    cnn.add(Flatten())

    image = Input(shape=(32, 54, 3))

    features = cnn(image)

    # first output (name=generation) is whether or not the discriminator
    # thinks the image that is being shown is fake, and the second output
    # (name=auxiliary) is the class that the discriminator thinks the image
    # belongs to.
    fake = Dense(1, activation='sigmoid', name='generation')(features)
    aux = Dense(num_classes, activation='softmax', name='auxiliary')(features)

    return Model(image, [fake, aux])

But I want to generate images of size (200, 200) instead of (54, 32). I have tried changing several parameters in the layers, but I always get this error:

ValueError: Input 0 of layer auxiliary is incompatible with the layer: expected axis -1 of input shape to have value 4000000 but received input with shape (None, 179200)

Which parameters should be changed to generate images of shape (200, 200)?

【Comments】:

    Tags: python machine-learning keras deep-learning conv-neural-network


    【Solution 1】:

    A simple solution is to start from a 25x25 feature map:

    cnn.add(Dense(25*25*432, input_dim=latent_size, activation='relu'))
    cnn.add(Reshape((25, 25, 432)))
    

    Then apply three transposed convolutions, each doubling the spatial size: 25 x 2 x 2 x 2 = 200

    cnn.add(Conv2DTranspose(192, 2, strides=2, padding='valid',
                        activation='relu',
                        kernel_initializer='glorot_normal'))
    cnn.add(BatchNormalization())
    
    cnn.add(Conv2DTranspose(96, 2, strides=2, padding='valid',
                        activation='relu',
                        kernel_initializer='glorot_normal'))
    cnn.add(BatchNormalization())
    
    # Output layer: tanh and no BatchNormalization, matching the output
    # layer of the generator in the question.
    cnn.add(Conv2DTranspose(3, 2, strides=2, padding='valid',
                        activation='tanh',
                        kernel_initializer='glorot_normal'))
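
The size arithmetic above can be checked without building a model. The following is a minimal sketch in plain Python, assuming the Keras defaults for Conv2DTranspose (no `output_padding`, `dilation_rate=1`), where `padding='same'` gives `input * stride` and `padding='valid'` gives `input * stride + max(kernel - stride, 0)`. The helper names `conv2d_transpose_length` and `trace_generator` are hypothetical and not part of Keras.

```python
def conv2d_transpose_length(size, kernel, stride, padding):
    """Output length along one axis of a Keras Conv2DTranspose layer,
    assuming default output_padding and dilation_rate=1."""
    if padding == "same":
        return size * stride
    if padding == "valid":
        return size * stride + max(kernel - stride, 0)
    raise ValueError(f"unsupported padding: {padding!r}")


def trace_generator(start, layer_specs):
    """Return the (height, width) after each transposed convolution.

    `layer_specs` is a list of (kernel, (stride_h, stride_w), padding) tuples.
    """
    h, w = start
    shapes = [(h, w)]
    for kernel, (stride_h, stride_w), padding in layer_specs:
        h = conv2d_transpose_length(h, kernel, stride_h, padding)
        w = conv2d_transpose_length(w, kernel, stride_w, padding)
        shapes.append((h, w))
    return shapes


# Generator from the question: Reshape((4, 3, 432)) followed by three layers.
original = trace_generator((4, 3), [(2, (2, 2), "valid"),
                                    (5, (2, 3), "same"),
                                    (5, (2, 3), "same")])

# Generator from the answer: Reshape((25, 25, 432)), three 2x2 stride-2 layers.
resized = trace_generator((25, 25), [(2, (2, 2), "valid")] * 3)

print(original)  # [(4, 3), (8, 6), (16, 18), (32, 54)]
print(resized)   # [(25, 25), (50, 50), (100, 100), (200, 200)]
```

In this setup the strides and the starting Reshape determine the output size; the `filters` argument only sets the channel count, and the kernel size affects the spatial size only for `padding='valid'` when the kernel is larger than the stride.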
    

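    【Discussion】:

    • Hi @maggu. Thanks for your answer. I followed your instructions but could not train the model. Please take a look at this colab: colab.research.google.com/drive/…. I hope sharing my colab link does not break any rules.
    • Could you also check the build_discriminator() code? I am not sure it is correct.
    • So you also need to adapt the discriminator to the desired image size, in this case input_shape=(200, 200, 3) and, further down, image = Input(shape=(200, 200, 3)). Does it work then?
    • After making the changes you described, I get this error: "ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 32 and the array at index 1 has size 200"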
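
To see what changing the discriminator input does to the layers after Flatten(), the flattened feature size can be computed by hand. This is a rough sketch in plain Python, assuming the four Conv2D layers from the question (strides 2, 1, 2, 1, all with `padding='same'`, 256 filters in the last layer), where a 'same' convolution yields `ceil(input / stride)` along each axis. The helper names are hypothetical.

```python
import math


def conv2d_same_length(size, stride):
    """Output length along one axis of a Conv2D layer with padding='same'."""
    return math.ceil(size / stride)


def flattened_features(height, width, strides=(2, 1, 2, 1), last_filters=256):
    """Size of the Flatten() output for the discriminator in the question."""
    for stride in strides:
        height = conv2d_same_length(height, stride)
        width = conv2d_same_length(width, stride)
    return height * width * last_filters


print(flattened_features(32, 54))    # 28672  (8 * 14 * 256)
print(flattened_features(200, 200))  # 640000 (50 * 50 * 256)
```

Both `input_shape` in the first Conv2D and the shape passed to `Input` have to agree for the two Dense heads to be built with the matching feature size. Neither value above matches the numbers in the question's error message, which presumably came from an intermediate attempt not shown in the thread. The concatenation error in the last comment is consistent with the real training images still being 32 pixels along that axis while the generated ones are 200; that reading is a guess from the message and is not confirmed in the discussion.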