Res-Family: From ResNet to SE-ResNeXt

 

姚伟峰
http://www.cnblogs.com/Matrix_Yao/

ResNet(2015 Dec)

Paper

Deep Residual Learning for Image Recognition

Network Visualization

https://dgschwend.github.io/netscope/#/preset/resnet-50

Problem Statement

A paradox between neural network depth and its representation capability.

  • Intuition:
    • the deeper the network, the stronger its representation capability
  • Observation:
    • network performance degrades as the network gets deeper


Why

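  • Overfitting: ruled out, since the deeper plain net also shows higher training error.
  • Vanishing gradients: ruled out; the back-propagated gradients were checked, and BN also guards against this.
  • Conjecture: deep plain nets may have exponentially low convergence rates, which hampers the reduction of the training error.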

Conclusion

The current plain network design prevents us from gaining more representation capability simply by making the network deeper.

How to Solve it

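By construction, we can build a deep model whose performance is at least equal to that of its shallower counterpart: when the add-on block outputs 0, the deeper net performs exactly like the shallower net.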

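From a function approximation perspective, suppose we need to approximate a function H(x) and the shallower net already provides x; the add-on block then only needs to approximate H(x) - x, i.e. the residual. This is essentially residual learning, or the idea behind boosting, and it is the basic idea of ResNet.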
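The constructive argument can be sketched in a few lines of plain Python. The names `add_on` and `residual_block` are illustrative only, and the add-on block is modeled as a single linear map for brevity; a real block would stack convolutions and nonlinearities.

```python
def add_on(x, weights):
    """Hypothetical add-on block F(x): a plain linear map, out[i] = sum_j W[i][j] * x[j]."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weights]


def residual_block(x, weights):
    """Residual block: output = x + F(x)."""
    fx = add_on(x, weights)
    return [xi + fi for xi, fi in zip(x, fx)]


x = [1.0, -2.0, 0.5]                           # stands in for the shallower net's output
zero_weights = [[0.0] * 3 for _ in range(3)]   # add-on block that outputs 0

# With F(x) = 0 the deeper net reproduces the shallower one exactly.
assert residual_block(x, zero_weights) == x
```

Because the zero-weight solution always exists, the deeper model should be able to do at least as well as the shallower one on the training set, provided the optimizer can find it.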

Breakdown

Residual Module

The block on the right is called the bottleneck architecture.
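To get a feel for why the bottleneck is attractive, the weight counts of the two block types can be compared. This is a rough sketch that assumes the channel widths shown in the ResNet paper's figure (a basic block with two 3x3 convolutions on a 64-d input, and a bottleneck of 1x1, 3x3, 1x1 convolutions on a 256-d input); biases and BN parameters are ignored, and `conv_params` is an illustrative helper.

```python
def conv_params(k, c_in, c_out):
    """Weight count of a k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


# Basic block on a 64-d input: 3x3,64 -> 3x3,64
basic = conv_params(3, 64, 64) + conv_params(3, 64, 64)

# Bottleneck block on a 256-d input: 1x1,64 -> 3x3,64 -> 1x1,256
bottleneck = (conv_params(1, 256, 64)
              + conv_params(3, 64, 64)
              + conv_params(1, 64, 256))

print(basic, bottleneck)  # 73728 69632
```

Under these assumptions the bottleneck handles four times as many input and output channels with slightly fewer weights than the basic block.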


Identity Shortcut and Projection Shortcut

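In the topology diagram above, solid lines denote identity shortcuts and dashed lines denote projection shortcuts. A projection shortcut appears when the operations inside the module change the dimensions of the feature map (height, width, channel), so a projection is needed to match dimensions. In the figure below, f denotes the number of output channels of the module.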
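The choice between the two shortcut types comes down to shape bookkeeping, which can be sketched as follows. Both helpers are hypothetical, and the projection is assumed to be a 1x1 convolution whose stride absorbs the spatial downsampling, as in the projection option of the ResNet paper.

```python
def shortcut_kind(in_shape, out_shape):
    """Pick the shortcut type from (height, width, channels) shapes."""
    return "identity" if in_shape == out_shape else "projection"


def projection_spec(in_shape, out_shape):
    """Hypothetical helper: parameters of a 1x1 conv that matches dimensions."""
    stride = in_shape[0] // out_shape[0]    # spatial downsampling factor
    return {"kernel": 1, "stride": stride, "f": out_shape[2]}


# Same shape in and out: the input can be added to the output directly.
print(shortcut_kind((56, 56, 256), (56, 56, 256)))    # identity

# Spatial size halves and channels double: a projection is required.
print(shortcut_kind((56, 56, 256), (28, 28, 512)))    # projection
print(projection_spec((56, 56, 256), (28, 28, 512)))  # {'kernel': 1, 'stride': 2, 'f': 512}
```

A projection can also be needed without any downsampling, for example when only the channel count changes; in that case the sketch yields a stride of 1.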


Tricks

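The relationship between h/w and c: each time the spatial size is downsampled by 1/2, c is multiplied by 2, so the product h × c (and likewise w × c) stays the same from one stage to the next.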
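The halving and doubling rule is easy to tabulate. The sketch below assumes the ResNet-50 stage output sizes for a 224x224 input, starting from 56x56 with 256 channels; `stage_shapes` is an illustrative helper, not part of any library.

```python
def stage_shapes(h0=56, c0=256, num_stages=4):
    """List (spatial size, channels) per stage: h halves while c doubles."""
    shapes = []
    h, c = h0, c0
    for _ in range(num_stages):
        shapes.append((h, c))
        h, c = h // 2, c * 2
    return shapes


shapes = stage_shapes()
print(shapes)                       # [(56, 256), (28, 512), (14, 1024), (7, 2048)]
print([h * c for h, c in shapes])   # [14336, 14336, 14336, 14336]
```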

Thought Experiment

  1. Which optimization strategies apply to ResNet-50 inference?
    • operator fusion
      • vertical fusion
        • BN folding
          • conv + BN + ScaleShift -> conv
        • conv + relu
        • conv + relu + pooling
        • conv + eltsum + relu
      • horizontal fusion
        • multi-branch fusion
    • advanced tech
      • lossless topology compression
    • other potentials
      • kernel = 1 pooling optimization
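Of the strategies listed, BN folding is the simplest to illustrate. The sketch below folds inference-time BN (statistics plus scale/shift) into a 1x1 convolution evaluated at a single pixel, using plain Python lists; all function names are illustrative and `eps=1e-5` is an assumed default. The same per-output-channel scaling applies to larger kernels.

```python
import math


def conv1x1(x, weight, bias):
    """1x1 convolution at one pixel: out[o] = sum_i weight[o][i] * x[i] + bias[o]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time BN with scale/shift, applied per output channel."""
    return [g * (yi - m) / math.sqrt(v + eps) + b
            for yi, g, b, m, v in zip(y, gamma, beta, mean, var)]


def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BN statistics and scale/shift into the conv weights and bias."""
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    new_weight = [[w * s for w in row] for row, s in zip(weight, scale)]
    new_bias = [(b - m) * s + bt
                for b, m, s, bt in zip(bias, mean, scale, beta)]
    return new_weight, new_bias


# Arbitrary example values: 3 input channels, 2 output channels.
x = [0.5, -1.0, 2.0]
weight = [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]]
bias = [0.01, -0.02]
gamma, beta = [1.5, 0.8], [0.1, -0.3]
mean, var = [0.2, -0.1], [0.9, 1.7]

reference = batch_norm(conv1x1(x, weight, bias), gamma, beta, mean, var)
folded_w, folded_b = fold_bn(weight, bias, gamma, beta, mean, var)
folded = conv1x1(x, folded_w, folded_b)

# The single folded conv matches conv followed by BN up to rounding error.
assert all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
           for a, b in zip(reference, folded))
```

Since BN at inference time is an affine map per channel, the folded network does the same computation with one operator instead of three.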

Another Perspective

We can treat ResNet as a Stacked Boosting Ensemble.

ResNet-v2 (2016 Jul)

Paper

Identity Mappings in Deep Residual Networks

Network Visualization

http://dgschwend.github.io/netscope/#/gist/6a771cf2bf466c5343338820d5102e66

Motivation

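When we express ResNet as a general formula:

    y_l = h(x_l) + F(x_l, W_l)
    x_{l+1} = f(y_l)

In ResNet, h is an identity mapping and f is ReLU. If f also becomes an identity mapping, this highway would be even smoother; would the results be better?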
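To see what a smoother highway means, consider the case where both the shortcut h and the after-addition function f are identity mappings. The derivation below follows the notation of the Identity Mappings paper; the symbol \mathcal{E} for the loss is an assumption of this sketch.

```latex
\begin{aligned}
x_{l+1} &= x_l + F(x_l, W_l) \\
x_L &= x_l + \sum_{i=l}^{L-1} F(x_i, W_i) \\
\frac{\partial \mathcal{E}}{\partial x_l}
  &= \frac{\partial \mathcal{E}}{\partial x_L}
     \left( 1 + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F(x_i, W_i) \right)
\end{aligned}
```

The constant 1 in the last line means the gradient at any deeper unit L reaches any shallower unit l directly, without passing through intermediate weight layers.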

 

The Ways

The Importance of Identity Shortcut

The authors tried many alternatives and compared them with the identity shortcut; none of them performed as well as the identity shortcut.
