A note up front: corrections are welcome, and feel free to discuss~
Generative Adversarial Nets
0. Summary
Two models are trained simultaneously (both implemented as neural networks so the whole system can be trained by backpropagation): a generator G that models the data distribution, and a discriminator D that estimates the probability that a sample x came from the real data rather than from G. At the optimum, G recovers the true data distribution and D outputs 1/2 for every input.
The paper gives a theoretical analysis of GANs and a simple minibatch training procedure.
The approach avoids intractable probabilistic computations that plague other generative models.
1. Background
Deep generative models have had less impact than discriminative ones because of 1) the difficulty of approximating many intractable probabilistic computations, and 2) the difficulty of leveraging the benefits of piecewise linear units in the generative context. This work sidesteps both issues.
Inspiration
2. Research Objective
Generative models
3. Method(s)
- Framework: a minimax two-player game between a generator G, which maps noise z ~ p_z(z) to data space, and a discriminator D(x), which outputs the probability that x came from the data rather than from G.

Explanation:

- Intuitive explanation: G is like a team of counterfeiters trying to produce fake currency, while D is like the police trying to detect it; competition drives both sides to improve until the counterfeits are indistinguishable from the genuine article.

Theoretical explanation:

- For a fixed G, the optimal discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)).
- The global minimum of the resulting criterion is achieved if and only if p_g = p_data, at which point D*(x) = 1/2 everywhere.
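The paper's Proposition 1 gives the optimal discriminator for a fixed G as D*(x) = p_data(x) / (p_data(x) + p_g(x)). A small numeric sketch (toy 1-D Gaussians of my own choosing, not from the paper) shows that D* equals exactly 1/2 where the two densities coincide and approaches 1 deep inside the data distribution:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def optimal_d(x, mu_data=0.0, mu_g=2.0, sigma=1.0):
    """D*_G(x) = p_data(x) / (p_data(x) + p_g(x)), Proposition 1 of the paper,
    evaluated for two toy unit-variance Gaussians."""
    p_data = gaussian_pdf(x, mu_data, sigma)
    p_g = gaussian_pdf(x, mu_g, sigma)
    return p_data / (p_data + p_g)

print(optimal_d(1.0))   # densities are equal at the midpoint, so D* = 0.5
print(optimal_d(-3.0))  # far on the data side, D* is close to 1
```

If p_g were trained all the way to p_data, the two densities would coincide everywhere and D* would be identically 1/2, which is the equilibrium described in the Summary above.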
Network design:

- The input to G is random noise z ~ p_z(z); both G and D are multilayer perceptrons.
Loss function:

- min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 − D(G(z)))]
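As a sanity check on the value function, here is a minimal minibatch estimator of V(D, G) (the helper name and inputs are illustrative, not from the paper). At the global optimum D outputs 1/2 everywhere, so the estimate equals log(1/2) + log(1/2) = −log 4, the optimal value identified in the paper's Theorem 1:

```python
import numpy as np

def value_fn(d_real, d_fake):
    """Minibatch estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))],
    given D's outputs on a batch of real samples and a batch of fakes."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# At the optimum D(x) = 1/2 on every input, so V = -log 4
v_opt = value_fn(np.full(64, 0.5), np.full(64, 0.5))
print(v_opt)  # ≈ -1.3863
```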
Training method:

- Alternating minibatch stochastic gradient training (Algorithm 1 in the paper): update D by ascending its stochastic gradient, then update G by descending its own.
- Running the inner loop of max_D to completion before every G update is computationally prohibitive, and on a finite dataset it would cause D to overfit.
- Therefore D is updated k times for every one update of G. (This results in D being maintained near its optimal solution, so long as G changes slowly enough.)
- The theoretical justification of this training procedure is given in Section 4 of the paper.
- Early in training, G is trained to maximize log D(G(z)) rather than to minimize log(1 − D(G(z))): when G is still poor, D rejects its samples with high confidence, so log(1 − D(G(z))) saturates and provides little gradient, whereas the alternative objective does not.
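The training loop above (k discriminator ascent steps per generator step, with the non-saturating maximize-log D(G(z)) objective) can be sketched on a toy 1-D problem. Everything concrete here is an illustrative assumption rather than the paper's setup: a shift-only generator G(z) = z + b, a logistic discriminator D(x) = sigmoid(w·x + c) with hand-derived gradients, and arbitrary learning rates:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

w, c = 0.0, 0.0            # discriminator D(x) = sigmoid(w * x + c)
b = 0.0                    # generator shift: G(z) = z + b, z ~ N(0, 1)
lr, k, batch = 0.05, 1, 64 # real data: x ~ N(4, 0.5)

for step in range(200):
    # --- k discriminator updates: ascend E[log D(x)] + E[log(1 - D(G(z)))] ---
    for _ in range(k):
        x = rng.normal(4.0, 0.5, batch)
        g = rng.normal(0.0, 1.0, batch) + b
        d_real, d_fake = sigmoid(w * x + c), sigmoid(w * g + c)
        w += lr * (np.mean((1 - d_real) * x) - np.mean(d_fake * g))
        c += lr * (np.mean(1 - d_real) - np.mean(d_fake))
    # --- one generator update: ascend E[log D(G(z))] (non-saturating trick) ---
    g = rng.normal(0.0, 1.0, batch) + b
    d_fake = sigmoid(w * g + c)
    b += lr * np.mean((1 - d_fake) * w)

print(w, b)  # D weights real data positively; G's shift has moved toward the data
```

With the real data centered at 4 and the fakes starting near 0, D quickly learns a positive weight w, and the non-saturating generator gradient then pushes the shift b toward the data mean, illustrating why the trick matters when D confidently rejects early samples.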
4. Results
5. Novelty
6. Experiments & discussion of results
Conclusion
Evaluation
- Could be applied to semi-supervised learning when labeled data is scarce
Advantages:
- No Markov chains are needed
- No inference is needed during learning
- A wide variety of functions can be incorporated into the model
- Unlike Markov-chain-based methods, it can represent very sharp, even degenerate distributions
- Components of the input are not copied directly into the generator's parameters
Open problems:
- There is no explicit representation of the generator's distribution p_g(x)
- D must be synchronized well with G during training (in particular, G must not be trained too much without updating D, in order to avoid "the Helvetica scenario" in which G collapses too many values of z to the same value of x and loses the diversity needed to model p_data)