【Title】: Why is my generative adversarial network (GAN) not converging during training?
【Posted】: 2020-03-04 10:29:19
【Question】:

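I am building a GAN to detect anomalies in images. I built the model in Keras because it is the only framework I am familiar with. Both the generator and the discriminator are autoencoders, and I defined my own loss functions. The idea is this: the model is trained only on normal images and tested on both normal and anomalous images. Because it only sees normal images during training, it cannot reconstruct anomalous images as well as normal ones, so a large reconstruction error can be used as an indicator of an anomaly. The reason for using two autoencoders is that the second reconstruction will be even farther from the original input, so anomalous images are separated better.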
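The scoring idea described above can be sketched with plain NumPy. This is only an illustration under stated assumptions: the helper names `anomaly_scores` and `fit_threshold`, the 95th-percentile threshold, and the random toy arrays standing in for real images and model reconstructions are all hypothetical and not part of the original model.

```python
import numpy as np

def anomaly_scores(images, reconstructions):
    """Per-image mean absolute reconstruction error (hypothetical helper)."""
    diff = np.abs(images.astype("float64") - reconstructions.astype("float64"))
    # Flatten each image and average over its pixels: one score per image.
    return diff.reshape(len(images), -1).mean(axis=1)

def fit_threshold(normal_scores, percentile=95.0):
    """Pick a threshold from scores of normal images (the percentile is an assumption)."""
    return float(np.percentile(normal_scores, percentile))

# Toy data standing in for real images and autoencoder outputs.
rng = np.random.default_rng(0)
normal = rng.uniform(-1, 1, size=(100, 8, 8, 1))
normal_rec = normal + rng.normal(0, 0.05, size=normal.shape)          # close reconstructions
anomalous = rng.uniform(-1, 1, size=(10, 8, 8, 1))
anomalous_rec = anomalous + rng.normal(0, 0.5, size=anomalous.shape)  # poor reconstructions

threshold = fit_threshold(anomaly_scores(normal, normal_rec))
flags = anomaly_scores(anomalous, anomalous_rec) > threshold  # True means "flag as anomaly"
```

With a real model, `normal_rec` and `anomalous_rec` would presumably come from the trained network's `predict` call on held-out images rather than from added noise.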

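I don't know what is going wrong, but no matter how many batches I train on, the model does not converge. I also tried building my discriminator and GAN with only one output each, but that did not improve the results.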

# Build model

import keras
from keras import backend as K
from keras.layers import ReLU, LeakyReLU, Conv2D, Conv2DTranspose, BatchNormalization, concatenate, Flatten, Dense, Reshape
from keras.models import Model, clone_model
import numpy as np


# Build autoencoder to be the generator

img_shape = (152, 232, 1) # This is the shape of my input images
latent_dim = 16

inputs = keras.Input(shape=img_shape)
x = Conv2D(16, 3, padding='same', strides=(2,2), activation='relu')(inputs)
x = BatchNormalization()(x) 
x = Conv2D(32, 3, padding='same', strides=(2,2), activation='relu')(x)
x = BatchNormalization()(x)
shape = K.int_shape(x)
x = Flatten()(x)
latent = Dense(latent_dim, name='latent_vector')(x)
x = Dense(shape[1] * shape[2] * shape[3])(latent)
x = Reshape((shape[1], shape[2], shape[3]))(x)
x = Conv2DTranspose(32, 3, padding='same')(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
x = Conv2DTranspose(16, 3, padding='same', strides=(2,2))(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
outputs = Conv2DTranspose(1, 3, padding='same', activation='tanh', strides=(2,2))(x)

generator = Model(inputs, outputs)

# Make a second autoencoder to be used as discriminator
ae_disc = clone_model(generator)
ae_disc.name="autoencoder_discriminator"

# Freeze the weights of generator
generator.trainable = False 

gen_outputs = generator(inputs)
dis_outputs_1 = ae_disc(inputs)
dis_outputs_2 = ae_disc(gen_outputs)

# Build discriminator
discriminator = Model(inputs, [dis_outputs_1, dis_outputs_2])

# Define loss function for discriminator 
loss_d = K.sum(K.abs(inputs - dis_outputs_1)) - K.sum(K.abs(gen_outputs - dis_outputs_2))
discriminator.add_loss(loss_d)

# Compile discriminator
discriminator_optimizer = keras.optimizers.RMSprop(lr=0.0008, clipvalue=1.0, decay=1e-8)
discriminator.compile(optimizer=discriminator_optimizer)

# Freeze autoencoder discriminator and unfreeze generator
ae_disc.trainable = False 
generator.trainable = True 

gen_outputs = generator(inputs)
gan_outputs_1 = ae_disc(inputs) 
gan_outputs_2 = ae_disc(gen_outputs)

# Build gan
gan = Model(inputs, [gan_outputs_1, gan_outputs_2]) 

# Define gan loss
loss_g = K.sum(K.abs(inputs - gen_outputs)) + K.sum(K.abs(gen_outputs - gan_outputs_2))
gan.add_loss(loss_g)

# Compile gan
gan_optimizer = keras.optimizers.RMSprop(lr=0.0008, clipvalue=1.0, decay=1e-8)
gan.compile(optimizer=gan_optimizer)



# Train model

# Squeeze pixel values into [-1, 1] since I use 'tanh' as activation for the autoencoder output
x_train = train_imgs.astype('float32') / 255.*2-1 

batch_size = 20

start = 0
for step in range(1000):
    stop = start + batch_size
    images = x_train[start: stop]

    d_loss = discriminator.train_on_batch(images, None)    
    g_loss = gan.train_on_batch(images, None)

    start += batch_size
    if start > len(x_train) - batch_size:
        start = 0

    # Print losses
    if step % 10 == 0:
        # Print metrics
        print('discriminator loss at step %s: %s' % (step, d_loss))
        print('generator loss at step %s: %s' % (step, g_loss))   

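I expected g_loss and d_loss to keep getting smaller, but they drop during the first few batches and then just fluctuate without decreasing any further. I am fairly sure the model is not overfitting, because when I use the trained model to predict test images, the results are extremely blurry.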
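One thing that may make the fluctuation easier to read is logging the individual terms of the two losses instead of only their totals. In the code above, the term `|G(x) - D(G(x))|` enters `loss_d` with a minus sign and `loss_g` with a plus sign, so the two totals are pushed in opposite directions by that term and need not both decrease. The sketch below is a hypothetical NumPy helper (`loss_terms` is not part of the original code) that uses means rather than `K.sum`, an assumption made here so the values do not scale with batch size and image size.

```python
import numpy as np

def l1_mean(a, b):
    """Mean absolute difference between two arrays of the same shape."""
    return float(np.mean(np.abs(a - b)))

def loss_terms(x, g_x, d_x, d_g_x):
    """Break the two losses into their terms (hypothetical monitoring helper).

    x     : input images
    g_x   : generator reconstruction G(x)
    d_x   : discriminator reconstruction of the input, D(x)
    d_g_x : discriminator reconstruction of the generator output, D(G(x))
    """
    d_real = l1_mean(x, d_x)      # |x - D(x)|
    d_fake = l1_mean(g_x, d_g_x)  # |G(x) - D(G(x))|
    g_rec = l1_mean(x, g_x)       # |x - G(x)|
    return {
        "d_real": d_real,
        "d_fake": d_fake,
        "g_rec": g_rec,
        "loss_d": d_real - d_fake,  # same signs as loss_d in the question
        "loss_g": g_rec + d_fake,   # same signs as loss_g in the question
    }

# Toy arrays standing in for a batch and the corresponding model outputs.
x = np.zeros((2, 4, 4, 1))
terms = loss_terms(x, g_x=x + 0.1, d_x=x + 0.2, d_g_x=x + 0.4)
```

In the training loop, the four arrays could be obtained with `generator.predict(images)` and `ae_disc.predict(...)` every few steps, and the returned dictionary printed alongside `d_loss` and `g_loss`.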

【Comments】:

    Tags: deep-learning autoencoder anomaly-detection generative-adversarial-network


    【Answer 1】:

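    Actually, I am still quite suspicious of my model. The discriminator is built so that it depends on the generator, and it is essentially the same as the gan model, which makes me think the generator and discriminator are not really competing against each other during training. Also, if the discriminator and the gan have the same inputs and outputs, then at inference time I could use either one to reconstruct new images, which also seems suspicious. So I tried building the generator and the discriminator (2 inputs and 2 outputs) as two separate networks and chaining them together to build the gan. Unfortunately, when I train the gan, I keep getting an error telling me that I must feed a value to the first layer of the discriminator.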
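For reference, the two-input variant described above can be written out as formulas. This is a transcription of the `loss_d` and `loss_g` expressions in the code that follows, with $G$ for `generator`, $D$ for `ae_disc`, and $\lVert\cdot\rVert_1$ for the sum of absolute differences; the notation itself is introduced here and is not from the original post.

```latex
\begin{aligned}
\mathcal{L}_D(x_1, x_2) &= \lVert x_1 - D(x_1) \rVert_1 \;-\; \lVert x_2 - D(x_2) \rVert_1, \\
\mathcal{L}_G(x)        &= \lVert x - G(x) \rVert_1 \;+\; \lVert G(x) - D(G(x)) \rVert_1 .
\end{aligned}
```

During training, $x_1 = x$ and $x_2 = G(x)$, so $\mathcal{L}_D$ has the same terms as the single-input discriminator loss in the question. The difference is that $G(x)$ is computed beforehand with `generator.predict` and fed to the discriminator as plain data in its update step.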

    import keras
    from keras import backend as K
    from keras.layers import ReLU, LeakyReLU, Conv2D, Conv2DTranspose, BatchNormalization, concatenate, Flatten, Dense, Reshape
    from keras.models import Model, clone_model, load_model
    import numpy as np
    
    
    K.clear_session()
    
    # Build autoencoder to be the generator
    
    img_shape = (152, 232, 1)
    latent_dim = 16
    
    inputs = keras.Input(shape=img_shape)
    x = Conv2D(16, 3, padding='same', strides=(2,2), activation='relu')(inputs)
    x = BatchNormalization()(x)
    x = Conv2D(32, 3, padding='same', strides=(2,2), activation='relu')(x)
    x = BatchNormalization()(x)
    shape = K.int_shape(x)
    x = Flatten()(x)
    latent = Dense(latent_dim, name='latent_vector')(x)
    x = Dense(shape[1] * shape[2] * shape[3])(latent)
    x = Reshape((shape[1], shape[2], shape[3]))(x)
    x = Conv2DTranspose(32, 3, padding='same')(x)
    x = LeakyReLU()(x)
    x = BatchNormalization()(x)
    x = Conv2DTranspose(16, 3, padding='same', strides=(2,2))(x)
    x = LeakyReLU()(x)
    x = BatchNormalization()(x)
    outputs = Conv2DTranspose(1, 3, padding='same', activation='tanh', strides=(2,2))(x)
    
    generator = Model(inputs, outputs)
    generator.summary()
    
    ae_disc = clone_model(generator)
    ae_disc.name="autoencoder_discriminator"
    
    inputs_1 = keras.Input(shape=img_shape)
    inputs_2 = keras.Input(shape=img_shape)
    dis_outputs_1 = ae_disc(inputs_1)
    dis_outputs_2 = ae_disc(inputs_2)
    
    # Build discriminator
    discriminator = Model([inputs_1, inputs_2], [dis_outputs_1, dis_outputs_2])
    
    # Define loss function for discriminator
    loss_d = K.sum(K.abs(inputs_1 - dis_outputs_1)) - K.sum(K.abs(inputs_2 - dis_outputs_2))
    discriminator.add_loss(loss_d)
    
    # Compile discriminator
    discriminator_optimizer = keras.optimizers.RMSprop(lr=0.0008, clipvalue=1.0, decay=1e-8)
    discriminator.compile(optimizer=discriminator_optimizer)
    discriminator.summary()
    
    # Freeze discriminator
    discriminator.trainable = False 
    
    gan_inputs = keras.Input(shape=img_shape)
    dis_input_1 = keras.activations.linear(gan_inputs)
    dis_input_2 = generator(gan_inputs)
    [gan_outputs_1, gan_outputs_2] = discriminator([dis_input_1, dis_input_2])
    
    # Build gan
    gan = Model(gan_inputs, [gan_outputs_1, gan_outputs_2]) 
    
    # Define gan loss
    loss_g = K.sum(K.abs(gan_inputs - dis_input_2)) + K.sum(K.abs(dis_input_2 - gan_outputs_2))
    gan.add_loss(loss_g)
    
    # Compile gan
    gan_optimizer = keras.optimizers.RMSprop(lr=0.0008, clipvalue=1.0, decay=1e-8)
    gan.compile(optimizer=gan_optimizer)
    gan.summary()
    
    # Train model 
    
    # Squeeze pixel values into [-1, 1] since I use 'tanh' as activation for the autoencoder output
    x_train = train_imgs.astype('float32') / 255.*2-1 
    
    batch_size = 20
    
    start = 0
    for step in range(1000):
        stop = start + batch_size
        images = x_train[start: stop]
        generated_images = generator.predict(images)
    
        d_loss = discriminator.train_on_batch([images, generated_images], None)    
        g_loss = gan.train_on_batch(images, None)
    
        start += batch_size
        if start > len(x_train) - batch_size:
            start = 0
    
        # Print losses
        if step % 10 == 0:
            # Print metrics
            print('discriminator loss at step %s: %s' % (step, d_loss))
            print('generator loss at step %s: %s' % (step, g_loss))  
    

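    This is the error message I get: InvalidArgumentError: You must feed a value for placeholder tensor 'input_2' with dtype float and shape [?,152,232,1] [[{{node input_2}} = Placeholder[dtype=DT_FLOAT, shape=[?,152,232,1], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

    Does anyone know how to solve this? Many thanks in advance!

    【Comments】:

    • Add this as an update to your question rather than as an answer to your own question.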