Hung-yi Lee (李宏毅), Deep Learning 2017: Tips for Deep Learning

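**1. First get good results on the training data: training-set performance is always the first thing to check.

2. Then look at the results on the testing data.

3. If step 1 is good but step 2 is bad, that is overfitting.**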

Example:
On the training set, the 56-layer network already performs worse than the 20-layer one, so its worse performance on the testing set cannot be attributed to overfitting.

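Note: pick the remedy that matches the problem you actually have. There are two branches: poor results on the training set, or good training results but poor results on the testing set.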
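The two-branch check above can be sketched as a small function. This is only an illustration: the name `diagnose` and the boolean inputs are made up here, and what counts as "good" performance is left to the reader.

```python
def diagnose(train_ok: bool, test_ok: bool) -> str:
    """Map the two checks from the notes onto the next step to take."""
    if not train_ok:
        # Poor training results: fix training first
        # (for example the activation function or the learning rate).
        return "fix training"
    if not test_ok:
        # Good on the training data but poor on the testing data.
        return "overfitting"
    return "done"


# The 56-layer network in the example is already worse on the training set,
# so the diagnosis is a training problem, not overfitting.
print(diagnose(train_ok=False, test_ok=False))  # prints "fix training"
```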

Tuning method 1:
On the training set, a deeper network does not automatically do better: stacking more layers can make training performance worse.

Cause of the vanishing gradient:

Remedy:
Change the activation function
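A common explanation for vanishing gradients, assumed here because it is not spelled out in these notes, is that the sigmoid's derivative never exceeds 0.25, so the gradient shrinks each time it passes back through a sigmoid layer. The sketch below ignores the weights (treats them as 1) purely to isolate that effect; `backprop_factor` is a hypothetical helper name.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z: float) -> float:
    """Derivative of the sigmoid; its maximum is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def backprop_factor(num_layers: int, z: float = 0.0) -> float:
    """Product of sigmoid derivatives over num_layers layers.

    Simplifying assumption: every weight is 1 and every layer sees the
    same pre-activation z, so only the activation's effect is visible.
    """
    return sigmoid_grad(z) ** num_layers


# Even in the best case (z = 0) the factor shrinks geometrically with depth.
for depth in (1, 5, 10):
    print(depth, backprop_factor(depth))
```

With ReLU the derivative is 1 for positive inputs, so the same product does not shrink along active paths, which is one reason changing the activation function helps.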

ReLU variants
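As a minimal sketch of what a ReLU variant looks like, here are plain ReLU and Leaky ReLU on a single scalar. The default slope of 0.01 is a commonly used value and an assumption here; Parametric ReLU uses the same formula but treats the slope as a trainable parameter.

```python
def relu(z: float) -> float:
    """ReLU: pass positive inputs through, clamp everything else to zero."""
    return z if z > 0 else 0.0


def leaky_relu(z: float, slope: float = 0.01) -> float:
    """Leaky ReLU: keep a small slope for negative inputs instead of zero.

    With a learned `slope` this becomes Parametric ReLU.
    """
    return z if z > 0 else slope * z


print(relu(-2.0), relu(3.0))
print(leaky_relu(-2.0), leaky_relu(3.0))
```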

Maxout: an activation function that is learned automatically

How to compute the activation max(z1, z2)
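A minimal sketch of the forward pass, assuming each z is an ordinary linear unit (weights times input plus bias) and the maxout unit outputs the largest one. The function names and the example weights are illustrative only.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def maxout(x, weights, biases):
    """Return (activation, index of the winning linear piece).

    weights[k] and biases[k] define z_k = weights[k] . x + biases[k];
    the activation is max_k z_k. Ties go to the lowest index.
    """
    zs = [dot(w, x) + b for w, b in zip(weights, biases)]
    k = max(range(len(zs)), key=zs.__getitem__)
    return zs[k], k


# With z1 = x and z2 = 0 the unit behaves exactly like ReLU.
weights, biases = [[1.0], [0.0]], [0.0, 0.0]
print(maxout([2.0], weights, biases))   # prints (2.0, 0)
print(maxout([-3.0], weights, biases))  # prints (0.0, 1)
```

Because the weights of every piece are trained, the shape of the resulting activation is learned rather than fixed in advance.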

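How do we differentiate an activation function built this way?

Differentiate piecewise: once we know which input is the largest, the unit is simply a linear function of that input.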
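The piecewise rule can be sketched as follows: for a given input, only the winning linear piece receives a gradient, and the other pieces get zero. This is a hand-written illustration with hypothetical names, not code from the lecture.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def maxout_grad(x, weights, biases):
    """Gradient of max_k(weights[k] . x + biases[k]) w.r.t. all weights and biases.

    For the winning piece k the unit equals w_k . x + b_k, so d/dw_k = x
    and d/db_k = 1. Every other piece does not affect the output, so its
    gradient is 0. Ties go to the lowest index.
    """
    zs = [dot(w, x) + b for w, b in zip(weights, biases)]
    k = max(range(len(zs)), key=zs.__getitem__)
    grad_w = [[xi if j == k else 0.0 for xi in x] for j in range(len(weights))]
    grad_b = [1.0 if j == k else 0.0 for j in range(len(weights))]
    return grad_w, grad_b


# z1 = 2.0 beats z2 = -1.0, so only the first piece gets a gradient.
gw, gb = maxout_grad([2.0, -1.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(gw)  # prints [[2.0, -1.0], [0.0, 0.0]]
print(gb)  # prints [1.0, 0.0]
```

Different inputs select different winners, so over a whole training set every piece still gets updated.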

Tuning method 2: adaptive learning rate

A method that improves on Adagrad: raise or lower the learning rate according to the shape of the error surface, giving weight to the newest gradient.
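The method is not named here; RMSProp fits the description and is assumed in the sketch below. It keeps a running root-mean-square of the gradients, where `alpha` decides how much the newest gradient counts against the history, and divides the learning rate by it. The default values and the small `eps` guard against division by zero are assumptions for illustration.

```python
import math


def rmsprop_step(w, grad, sigma, lr=0.001, alpha=0.9, eps=1e-8):
    """One RMSProp update for a single scalar parameter.

    sigma is the running RMS of past gradients (None before the first step).
    A smaller alpha makes the newest gradient count for more.
    """
    if sigma is None:
        sigma = abs(grad)  # first step: start from the first gradient
    else:
        sigma = math.sqrt(alpha * sigma ** 2 + (1.0 - alpha) * grad ** 2)
    w = w - lr / (sigma + eps) * grad
    return w, sigma


w, sigma = 1.0, None
for g in (2.0, 2.0, 0.5):
    w, sigma = rmsprop_step(w, g, sigma, lr=0.1)
    print(w, sigma)
```

Unlike Adagrad, which sums all past squared gradients with equal weight, the decaying average lets the effective learning rate grow again when recent gradients become small.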

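Momentum

Momentum can, to some extent, help escape a local minimum, because each step also takes the gradient from the previous time step into account.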
In fact, the computation ends up taking all previous gradients into account.
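A small sketch of why this holds, assuming the usual update v = lam * v - lr * g with the movement starting at zero. Unrolling the recursion gives a weighted sum of every past gradient, with older gradients decayed by powers of `lam`. The names `lam` and `lr` and their default values are illustrative.

```python
def momentum_movements(grads, lr=0.1, lam=0.9):
    """Run the recursive momentum update and return the movement at each step."""
    v = 0.0
    movements = []
    for g in grads:
        v = lam * v - lr * g  # previous movement plus the new gradient step
        movements.append(v)
    return movements


def weighted_sum_of_gradients(grads, lr=0.1, lam=0.9):
    """Closed form of the last movement: -lr * sum(lam^(age) * g)."""
    t = len(grads)
    return -lr * sum(lam ** (t - 1 - i) * g for i, g in enumerate(grads))


grads = [3.0, -1.0, 2.0]
print(momentum_movements(grads)[-1])
print(weighted_sum_of_gradients(grads))  # same value up to rounding
```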

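Question: how do we make sure performance is good on both the training set and the testing set? With a well-chosen learning rate, training for more epochs keeps improving training-set performance. The testing loss curve, however, does not keep falling as the number of epochs grows. So after every epoch we evaluate once on held-out data; this is called the validation set performance.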
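One way to use the per-epoch validation result is to stop once it no longer improves, a technique usually called early stopping. The sketch below works on a list of already-computed validation losses; the `patience` parameter and the sample numbers are assumptions for illustration, not values from the lecture.

```python
def best_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights should be kept.

    Scans validation losses in order and stops once the loss has failed
    to improve for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_index = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_index, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss stopped improving: stop training
    return best_index


# Validation loss falls until epoch 2, then rises for two epochs in a row.
print(best_epoch([0.9, 0.7, 0.6, 0.65, 0.7, 0.5]))  # prints 2
```

In a real training loop the loss would be computed after each epoch and the model weights from the best epoch saved, rather than scanning a finished list.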