Res-Family: From ResNet to SE-ResNeXt
ResNet (2015 Dec)
Paper
Deep Residual Learning for Image Recognition
Network Visualization
https://dgschwend.github.io/netscope/#/preset/resnet-50
Problem Statement
A paradox between neural network depth and its representation capability.
- Intuition: the deeper the network, the stronger its representation capability.
- Observation: network performance degrades as the network gets deeper.
Why
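- Overfitting: ruled out, since the deeper network also shows higher training error.
- Vanishing gradients: ruled out. The back-propagated gradients were checked, and BN also guards against this.
- Conjecture: deep plain nets may have exponentially low convergence rates, which slows the reduction of the training error.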
Conclusion
The current plain network design prevents us from pursuing greater representation capability by making the network deeper.
How to Solve it
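Using a constructive approach, we can build a deeper model whose performance is at least equal to that of its shallower counterpart: when the add-on block outputs 0, the deeper net performs exactly like the shallower net.
From a function approximation perspective, suppose we need to approximate a function H(x) (the notation of the ResNet paper). Instead of fitting H(x) directly, the add-on block fits F(x) = H(x) - x, i.e. the residual. This is essentially residual learning, or the idea behind boosting, and it is the basic idea of ResNet.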
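The constructive argument can be checked with a toy model. The sketch below is illustrative only: `shallow_net`, `residual_block` and `deeper_net` are hypothetical names, and the residual branch F is reduced to a per-element linear map rather than a real convolution stack.

```python
def shallow_net(x):
    """Stand-in for an already-trained shallower model (hypothetical)."""
    return [2.0 * v + 1.0 for v in x]


def residual_block(x, weights):
    """Compute y = x + F(x), where F is a toy per-element linear map.

    `weights` plays the role of the add-on block's parameters. With all
    weights at zero, F(x) = 0 and the block reduces to the identity.
    """
    fx = [w * v for w, v in zip(weights, x)]  # F(x): the residual branch
    return [v + r for v, r in zip(x, fx)]     # identity shortcut + residual


def deeper_net(x, weights):
    """Shallow model followed by one add-on residual block."""
    return residual_block(shallow_net(x), weights)


x = [0.5, -1.0, 3.0]
zero_weights = [0.0, 0.0, 0.0]
print(deeper_net(x, zero_weights) == shallow_net(x))  # True
```

With zero weights the deeper model reproduces the shallow one exactly, so training the add-on block only has to learn a correction on top of what the shallower model already provides.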
Breakdown
Residual Module
The right block is called the bottleneck architecture.
Identity Shortcut and Projection Shortcut
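In the topology diagram above, solid lines denote identity shortcuts and dashed lines denote projection shortcuts. A projection shortcut appears when the operations inside a module change the dimensions of the feature map (height, width, channel), so a projection is needed to match dimensions. In the figure below, f denotes the number of output channels of the module.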
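The choice between the two shortcut types can be sketched as a simple shape check. The helper names below are hypothetical. Realizing the projection as a 1x1 convolution, whose stride absorbs any spatial reduction, follows the option described in the ResNet paper.

```python
def shortcut_kind(in_shape, out_shape):
    """Return which shortcut a residual module needs.

    Shapes are (height, width, channels) tuples. An identity shortcut only
    works when the element-wise sum sees identical shapes on both branches.
    """
    return "identity" if in_shape == out_shape else "projection"


def projection_params(in_shape, out_shape):
    """Describe a 1x1 convolution that maps in_shape to out_shape.

    Assumes the spatial size shrinks by an integer factor, which becomes the
    stride, and that f (the module's output channel count) sets the filters.
    """
    in_h, in_w, _ = in_shape
    out_h, out_w, f = out_shape
    stride = in_h // out_h
    if in_h != out_h * stride or in_w != out_w * stride:
        raise ValueError("spatial size must shrink by an integer factor")
    return {"kernel": 1, "stride": stride, "filters": f}


print(shortcut_kind((56, 56, 256), (56, 56, 256)))      # identity
print(shortcut_kind((56, 56, 256), (28, 28, 512)))      # projection
print(projection_params((56, 56, 256), (28, 28, 512)))
```

A projection is also needed when only the channel count changes, for example from (56, 56, 64) to (56, 56, 256). In that case the stride stays at 1.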
Tricks
The relation between h/w and c: each time the spatial size is downsampled by 1/2, c is multiplied by 2, so
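The halving and doubling rule can be traced stage by stage. The sketch below assumes the ResNet-50 configuration on a 224x224 input, where the first residual stage outputs 56x56 feature maps with 256 channels (taken from the architecture table of the ResNet paper). `stage_shapes` is a hypothetical helper.

```python
def stage_shapes(h, w, c, num_stages):
    """List (h, w, c) per stage: halve h and w, double c at each transition."""
    shapes = [(h, w, c)]
    for _ in range(num_stages - 1):
        h, w, c = h // 2, w // 2, c * 2
        shapes.append((h, w, c))
    return shapes


for h, w, c in stage_shapes(56, 56, 256, 4):
    # The last column shows that h * c is the same for every stage.
    print(h, w, c, h * c)
# 56 56 256 14336
# 28 28 512 14336
# 14 14 1024 14336
# 7 7 2048 14336
```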
Thought Experiment
- Which optimization strategies apply to ResNet-50 inference?
  - operator fusion
    - vertical fusion
      - BN folding
        - conv + BN + ScaleShift -> conv
      - conv + relu
      - conv + relu + pooling
      - conv + eltsum + relu
    - horizontal fusion
      - multi-branch fusion
  - advanced tech
    - lossless topology compression
  - other potentials
    - kernel = 1 pooling optimization
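BN folding, the first vertical fusion listed above, works because inference-time BN is a fixed per-channel affine transform that can be absorbed into the preceding convolution's weights and bias. Below is a minimal sketch, assuming a 1x1 convolution evaluated at a single pixel so that plain Python lists suffice. The function names are hypothetical, and `gamma`/`beta` stand in for the ScaleShift parameters.

```python
import math


def conv1x1(x, weight, bias):
    """1x1 convolution at one pixel: out[o] = sum_i weight[o][i] * x[i] + bias[o]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time BN with fixed running statistics per output channel."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, g, b, m, s in zip(y, gamma, beta, mean, var)]


def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Return (weight', bias') so that conv1x1 alone reproduces conv1x1 + BN."""
    scale = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    new_weight = [[w * k for w in row] for row, k in zip(weight, scale)]
    new_bias = [(b - m) * k + t for b, m, k, t in zip(bias, mean, scale, beta)]
    return new_weight, new_bias


x = [1.0, -2.0, 0.5]
weight = [[0.2, -0.1, 0.4], [0.7, 0.3, -0.5]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.6]

reference = batch_norm(conv1x1(x, weight, bias), gamma, beta, mean, var)
folded_weight, folded_bias = fold_bn(weight, bias, gamma, beta, mean, var)
fused = conv1x1(x, folded_weight, folded_bias)
matches = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
              for a, b in zip(reference, fused))
print(matches)
```

The same per-output-channel scaling applies to larger kernels, since BN scales every weight that feeds a given output channel by the same factor.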
Another Perspective
We can treat ResNet as a Stacked Boosting Ensemble.
ResNet-v2 (2016 Jul)
Paper
Identity Mappings in Deep Residual Networks
Network Visualization
http://dgschwend.github.io/netscope/#/gist/6a771cf2bf466c5343338820d5102e66
Motivation
When we express ResNet as a general formula:
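In the notation of the cited paper (Identity Mappings in Deep Residual Networks), the l-th residual unit takes input x_l, applies a residual function F with weights W_l, a shortcut mapping h, and a function f after the addition:

```latex
\begin{aligned}
y_l &= h(x_l) + \mathcal{F}(x_l, \mathcal{W}_l), \\
x_{l+1} &= f(y_l)
\end{aligned}
```

In the original ResNet, h is the identity mapping and f is ReLU.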
The Ways
The Importance of Identity Shortcut
The authors compared many alternatives against the identity shortcut and found that none of them performed as well as the identity shortcut.
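One of the alternatives examined in the paper is a constant scaling of the shortcut, h(x) = scale * x. A rough way to see why it hurts: the signal carried by the shortcut path through many units is multiplied by the product of all the scales. The sketch below is a simplification that ignores the residual branches entirely. `shortcut_factor` is a hypothetical helper.

```python
def shortcut_factor(scale, num_units):
    """Factor multiplying x_l after num_units shortcuts of the form h(x) = scale * x."""
    factor = 1.0
    for _ in range(num_units):
        factor *= scale
    return factor


for scale in (1.0, 0.9, 1.1):
    print(scale, shortcut_factor(scale, 100))
```

Only scale = 1.0, the identity shortcut, keeps the factor at 1. A scale below 1 shrinks it toward zero and a scale above 1 blows it up, in both cases exponentially in the number of units.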