【Question title】:Implementation of normalizing flows in Keras
【Posted】:2017-07-25 23:52:19
【Question】:

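I have been trying to implement a simple version of normalizing flows in Keras, as described in this paper: https://arxiv.org/pdf/1505.05770.pdf

My problem is that the loss is always -infinity, and I can't work out what I am doing wrong. Can anyone help me?

The procedure is as follows: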

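  1. The encoder produces vectors of size latent_dim = 100. These are z_mean, z_log_var, u, b, w

  2. From z_mean and z_log_var, using the reparameterization trick, I can sample z_0 ~ N(z_mean, exp(z_log_var))

  3. Then I can compute log(abs(1+u.T.dot(psi(z_0))))

  4. Then I can compute z_1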

Here is the code for these four steps:

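# imports assumed by the snippets (Keras 1.x-era API with the TensorFlow backend)
from keras import backend as K
from keras import objectives
import tensorflow as tf

def sampling(args):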
    z_mean, z_log_var = args

    # sample epsilon according to N(0, I)
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.,
                              std=epsilon_std)

    # generate z0 ~ N(z_mean, exp(z_log_var)) via the reparameterization trick
    z0 = z_mean + K.exp(z_log_var / 2) * epsilon
    print('z0', z0)
    return z0

def logdet_loss(args):
    z0, w, u, b = args
    b2 = K.squeeze(b, 1)
    beta = K.sum(tf.multiply(w, z0), 1)  # <w|z0>
    linear_trans = beta + b2  # <w|z0> + b

    # change u2 so that the transformation z0->z1 is invertible
    alpha = K.sum(tf.multiply(w, u), 1)  # <w|u>
    diag1 = tf.diag(K.softplus(alpha) - 1 - alpha)
    u2 = u + K.dot(diag1, w) / K.sum(K.square(w)+1e-7)
    gamma = K.sum(tf.multiply(w,u2), 1)

    logdet = K.log(K.abs(1 + (1 - K.square(K.tanh(linear_trans)))*gamma) + 1e-6)

    return logdet

def transform_z0(args):
    z0, w, u, b = args
    b2 = K.squeeze(b, 1)
    beta = K.sum(tf.multiply(w, z0), 1)

    # change u2 so that the transformation z0->z1 is invertible
    alpha = K.sum(tf.multiply(w, u), 1)
    diag1 = tf.diag(K.softplus(alpha) - 1 - alpha)
    u2 = u + K.dot(diag1, w) / K.sum(K.square(w)+1e-7)
    diag2 = tf.diag(K.tanh(beta + b2))

    # generate z1
    z1 = z0 + K.dot(diag2,u2) 
    return z1
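
For reference, the four steps above can be sketched in plain NumPy and checked against a numerically estimated Jacobian. This is a minimal sketch, not the asker's code: it assumes one w, u and b per sample (shapes (batch, dim), (batch, dim) and (batch, 1)), and the function names constrain_u and planar_flow are made up for illustration. Unlike the Keras snippet, it normalizes by each sample's own squared norm of w rather than by a sum over the whole batch.

```python
import numpy as np

def constrain_u(u, w):
    """Adjust u so that <w|u_hat> > -1, which keeps the tanh planar flow invertible."""
    wu = np.sum(w * u, axis=1, keepdims=True)           # <w|u>, shape (batch, 1)
    m = np.logaddexp(0.0, wu) - 1.0                     # softplus(<w|u>) - 1
    w_norm2 = np.sum(w ** 2, axis=1, keepdims=True)     # per-sample ||w||^2 (assumption)
    return u + (m - wu) * w / w_norm2

def planar_flow(z0, w, u, b):
    """Return z1 = z0 + u_hat * tanh(<w|z0> + b) and log|det dz1/dz0| per sample."""
    u_hat = constrain_u(u, w)
    lin = np.sum(w * z0, axis=1, keepdims=True) + b     # <w|z0> + b, shape (batch, 1)
    z1 = z0 + u_hat * np.tanh(lin)
    psi = (1.0 - np.tanh(lin) ** 2) * w                 # psi(z0) = tanh'(lin) * w
    logdet = np.log(np.abs(1.0 + np.sum(psi * u_hat, axis=1)))
    return z1, logdet

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z0, w, u = (rng.normal(size=(3, 4)) for _ in range(3))
    b = rng.normal(size=(3, 1))
    z1, logdet = planar_flow(z0, w, u, b)
    print(z1.shape, logdet.shape)
```

Because <w|u_hat> stays above -1 and tanh' lies in (0, 1], the argument of the logarithm stays positive, so no epsilon is needed inside the log in this sketch.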

And here is the loss (logdet is defined above):

def vae_loss(x, x_decoded_mean):
    xent_loss = K.mean(objectives.categorical_crossentropy(x, x_decoded_mean), -1)
    ln_q0z0 = K.sum(log_normal2(z0, z_mean, z_log_var, eps=1e-6), -1)
    ln_pz1 = K.sum(log_stdnormal(z1), -1)
    result = K.mean(logdet + ln_pz1 + xent_loss - ln_q0z0)
    return result
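
For comparison, the free energy bound in the linked paper has the form ln q0(z0) - ln p(x, z_K) - (sum of log-det terms), taken in expectation over z0. The sketch below spells out that sign convention in NumPy, assuming xent stands for -ln p(x|z1). The helper log_normal_diag is a hypothetical stand-in for log_normal2 and log_stdnormal, whose definitions are not shown in the question. Note that logdet, ln_pz1 and ln_q0z0 carry the opposite signs from vae_loss above; the thread does not confirm whether that is the cause of the divergence.

```python
import numpy as np

def log_normal_diag(z, mean, log_var):
    """Log-density of N(mean, exp(log_var)), summed over the last axis."""
    return -0.5 * np.sum(
        np.log(2 * np.pi) + log_var + (z - mean) ** 2 / np.exp(log_var), axis=-1
    )

def free_energy(xent, ln_q0z0, ln_pz1, logdet):
    """Per-sample negative ELBO, with xent = -ln p(x|z1) (assumption)."""
    return xent + ln_q0z0 - ln_pz1 - logdet

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mean, log_var = np.array([0.5, -1.0]), np.array([0.2, -0.3])
    z0 = mean + np.exp(log_var / 2) * rng.normal(size=(1000, 2))
    ln_q = log_normal_diag(z0, mean, log_var)
    # Identity flow (z1 = z0, logdet = 0): the mean is a Monte Carlo estimate of KL(q0 || p).
    print(free_energy(0.0, ln_q, log_normal_diag(z0, 0.0, 0.0), 0.0).mean())
```

With this convention the -ln_pz1 term grows as the norm of z1 grows, so large latent values are penalized rather than rewarded.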

【Comments】:

  • The norm of the latent variables seems to increase rapidly; after the first epoch it already exceeds 1e6.

Tags: python deep-learning keras autoencoder


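【Answer 1】:

I adapted the Keras VAE tutorial here: https://github.com/sbaurdlp/keras-iaf-mnist

In case anyone is interested in taking a look... Oddly, adding more layers does not improve performance, and I can't see what is wrong with the code.

【Comments】: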

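    【Answer 2】:

    Since I couldn't get it to work, I tried implementing the normalizing flow described in the paper Improving Variational Inference with Inverse Autoregressive Flow.

    However, I still run into the same problem of the loss diverging (toward -infinity), which makes no sense. There must be something wrong with my implementation.

    Here are the important parts: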

    # the encoder
    h = encoder_block(x)  # a convnet taking proteins as input (matrices of size 400x22), I don't describe it since it isn't very important
    z_log_var = Dense(latent_dim)(h)
    z_mean = Dense(latent_dim)(h)
    h_ = Dense(latent_dim)(h)
    encoder = Model(x, [z_mean,z_log_var, h_])
    
    # the latent variables (only one transformation to keep it simple)
    latent_input = Input(shape=(latent_dim, 2), batch_shape=(batch_size, latent_dim, 2))
    hl = Convolution1D(1, filter_length, activation="relu", border_mode="same")(latent_input)
    hl = Reshape((latent_dim,))(hl)
    mean_1 = Dense(latent_dim)(hl)
    std_1 = Dense(latent_dim)(hl)
    latent_model = Model(latent_input, [mean_1, std_1])
    
    # the decoder
    decoder_input = Input((latent_dim,), batch_shape=(batch_size, latent_dim))
    decoder=decoder_block()  # a convnet that I don't describe
    x_decoded_mean = decoder(decoder_input)
    generator = Model(decoder_input, x_decoded_mean)
    
    # the VAE
    z_mean, z_log_var, other = encoder(vae_input)
    eps = Lambda(sample_eps, name='sample_eps')([z_mean, z_log_var, other])
    z0 = Lambda(sample_z0, name='sample_z0')([z_mean, z_log_var, eps])
    l = Lambda(sample_l, name='sample_l')([eps, z_log_var])
    mean, std = latent_model(merge([Reshape((latent_dim,1))(z0), Reshape((latent_dim,1))(other)], mode="concat", concat_axis=-1))
    z = Lambda(transform_z0)([z0, mean, std])
    l = Lambda(transform_l)([l, std])
    x_decoded_mean = generator(z)
    vae = Model(vae_input, x_decoded_mean)
    
    # and here is the loss
    def vae_loss(x, x_decoded_mean):
        xent_loss = K.mean(objectives.categorical_crossentropy(x, x_decoded_mean), -1)
        ln_q0z0 = K.sum(log_normal2(z0, z_mean, z_log_var), -1)
        ln_pz1 = K.sum(log_stdnormal(z), -1)
        result = K.mean(l + ln_pz1 + xent_loss - ln_q0z0)
        return result
    

    Here are the utility functions I use in the Lambda layers above:

    def sample_eps(args):
    
        # sample epsilon according to N(0, I)
        epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.,
                                  std=epsilon_std)
    
        return epsilon
    
    def sample_z0(args):
        z_mean, z_log_var, epsilon = args
        # generate z0 ~ N(z_mean, exp(z_log_var)) via the reparameterization trick
        z0 = z_mean + K.exp(z_log_var / 2) * epsilon
        return z0
    
    def sample_l(args):
        epsilon, z_log_var = args
        l = -0.5*K.sum(z_log_var + epsilon**2 + K.log(2*math.pi), -1)
        return l
    
    def transform_z0(args):
        z0, mean, std = args
        z = z0
        sig_std = K.sigmoid(std)
        z *= sig_std
        z += (1-sig_std)*mean
        return z
    
    def transform_l(args):
        l, std = args
        sig_std = K.sigmoid(std)
        l -= K.sum(K.log(sig_std+1e-8), -1)
        return l
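
The update in transform_l, which subtracts sum(log(sigmoid(std))), equals the log-determinant of transform_z0 only when mean and std depend on z0 autoregressively (component i sees only components before i). The Dense layers in latent_model are not masked, so that assumption may not hold there. The NumPy sketch below is a hypothetical stand-in, not the posted model: it uses strictly lower-triangular linear maps for mean and std, with an additive context vector h playing the role of `other`. The names iaf_step, Wm, Ws and h are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def iaf_step(z0, Wm, Ws, h):
    """One IAF-style step whose mean/std are autoregressive in z0.

    z0: (batch, dim); Wm, Ws: (dim, dim); h: (dim,) context (assumption).
    """
    Lm = np.tril(Wm, k=-1)            # strictly lower-triangular mask:
    Ls = np.tril(Ws, k=-1)            # output i depends only on z0[:, :i]
    mean = z0 @ Lm.T + h
    std = z0 @ Ls.T + h
    gate = sigmoid(std)
    z = gate * z0 + (1.0 - gate) * mean
    # The Jacobian dz/dz0 is triangular with diagonal `gate`,
    # so log|det| is the sum of log(gate).
    logdet = np.sum(np.log(gate), axis=-1)
    return z, logdet

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    d = 5
    z, logdet = iaf_step(rng.normal(size=(2, d)), rng.normal(size=(d, d)),
                         rng.normal(size=(d, d)), rng.normal(size=d))
    print(z.shape, logdet.shape)
```

The log-density then updates as ln q(z) = ln q(z0) - logdet, which matches the form of transform_l under the autoregressive assumption.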
    

    【Comments】:
