A note up front: corrections are welcome, and so is discussion~

Generative Adversarial Nets

0. Summary

Two modules are trained simultaneously (both implemented as neural networks so the whole system can be trained by backpropagation): a generator G that models the data distribution, and a discriminator D that estimates the probability that a sample x came from the real data rather than from G. At the optimum, G recovers the real data distribution and D outputs 1/2 everywhere.
The paper gives a theoretical analysis of this framework and a straightforward training procedure.
The approach sidesteps the intractable probabilistic computations that hamper other generative models.
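
The claim that D ends up predicting 1/2 for every sample follows from the paper's analysis of the optimal discriminator; sketched in LaTeX:

```latex
% For a fixed generator G, maximizing V(D, G) pointwise over D gives
D^{*}_{G}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}
% At the global optimum of the minimax game, p_{g} = p_{\mathrm{data}}, hence
D^{*}_{G}(x) = \frac{p_{\mathrm{data}}(x)}{2\, p_{\mathrm{data}}(x)} = \frac{1}{2}
```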

1. Background

Generative models have lagged behind because of 1) the difficulty of approximating many intractable probabilistic computations, and 2) the difficulty of leveraging the benefits of piecewise linear units in the generative context. This work sidesteps both issues.

Sources of inspiration

2. Research Objective

Generative modeling

3. Method(s)

  • Explanation:

    • Intuitive explanation: G and D play an adversarial game, with D learning to tell real samples from generated ones while G learns to fool D. (figure in the paper)

    • Theoretical explanation: for a fixed G, the optimal discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)), and the global optimum of the game is p_g = p_data. (derivation in the paper)

  • Network design
    The input is random noise z; both G and D are modeled as multilayer neural networks.

  • Loss function: the two-player minimax game over the value function V(D, G):
    min_G max_D V(D, G) = E_{x ~ p_data(x)}[log D(x)] + E_{z ~ p_z(z)}[log(1 − D(G(z)))]

  • Training procedure:

    • Fully optimizing D in the inner loop (max over D) before every G update is computationally prohibitive, and on a finite dataset it would overfit D.

    • Instead, alternate k update steps for D with one update step for G. (This results in D being maintained near its optimal solution, so long as G changes slowly enough.) See Algorithm 1 in the paper.

    • The theoretical justification of the training procedure is in Section 4 of the paper.

    • Warm-up trick: early in training, G is trained to maximize log D(G(z)) rather than to minimize log(1 − D(G(z))). When G is still poor, D rejects its samples with high confidence, so log(1 − D(G(z))) saturates and provides too little gradient, while the alternative objective does not.
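
The alternating scheme above (k discriminator steps per generator step, plus the non-saturating maximize-log D(G(z)) trick) can be sketched on a 1-D toy problem. Everything below is an illustrative assumption, not the paper's setup: the paper uses multilayer perceptrons, whereas here the generator is affine and the discriminator is logistic regression, with hand-derived gradients.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Toy setup (assumption): real data ~ N(4, 1); generator G(z) = a*z + b
# with z ~ N(0, 1); discriminator D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0            # generator parameters
w, c = 0.0, 0.0            # discriminator parameters
lr, k, batch = 0.05, 2, 64

fake_mean_start = b  # E[G(z)] = b, since z ~ N(0, 1)

for step in range(2000):
    # --- k steps on D: ascend E[log D(x)] + E[log(1 - D(G(z)))] ---
    for _ in range(k):
        x = rng.normal(4.0, 1.0, batch)          # real samples
        g = a * rng.normal(0.0, 1.0, batch) + b  # fake samples
        dx, dg = sigmoid(w * x + c), sigmoid(w * g + c)
        gw = np.mean((1 - dx) * x) + np.mean(-dg * g)  # dV/dw
        gc = np.mean(1 - dx) + np.mean(-dg)            # dV/dc
        w += lr * gw
        c += lr * gc
    # --- 1 step on G: ascend the non-saturating objective E[log D(G(z))] ---
    z = rng.normal(0.0, 1.0, batch)
    dg = sigmoid(w * (a * z + b) + c)
    ga = np.mean((1 - dg) * w * z)  # d/da of log D(G(z))
    gb = np.mean((1 - dg) * w)     # d/db of log D(G(z))
    a += lr * ga
    b += lr * gb

fake_mean_end = b  # E[G(z)] after training; should drift toward the data mean 4
print(fake_mean_start, fake_mean_end)
```

With the seed fixed, the generator mean drifts from 0 toward the data mean; the run also illustrates the collapse tendency noted later, since the scale parameter a shrinks as G concentrates mass where D is fooled.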

4. Results

5. Novel contributions

6. Experiments & discussion of results

Conclusion

Evaluation

  1. Could be used for semi-supervised learning when labeled data is scarce.

Advantages:

  1. No Markov chains are needed.
  2. No inference is needed during learning.
  3. A wide variety of functions can be incorporated into the model.
  4. Unlike Markov-chain-based methods, GANs can represent very sharp, even degenerate, distributions.
  5. Components of the input are not copied directly into the generator's parameters.

Open problems:

  1. There is no explicit expression for the generator's distribution p_g(x).
  2. D must be kept carefully in sync with G during training (in particular, G must not be trained too much without updating D, in order to avoid "the Helvetica scenario," in which G collapses too many values of z to the same value of x and loses the diversity needed to model p_data).

Notes

Reference

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y. "Generative Adversarial Nets." NIPS 2014.
