1. https://arxiv.org/pdf/1512.03385.pdf#page=9&zoom=100,0,157
  2. Abstract
    1. residual: the part that remains, what is left over
    2. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
  3. introduction
    1. Recent evidence reveals that network depth is of crucial importance, and the leading results on the challenging ImageNet dataset all exploit “very deep” models, with a depth of sixteen to thirty.
    2. Problems brought by deeper networks:
      1. the notorious problem of vanishing/exploding gradients, largely addressed by normalized initialization and batch normalization
      2. with the network depth increasing, accuracy gets saturated (which might be unsurprising) and then degrades rapidly
    3. The model is essentially a mapping: instead of having the stacked layers fit the desired mapping H(x) directly, ResNet lets them fit the residual F(x) = H(x) − x, so the original mapping becomes F(x) + x.
    4. shortcut connection
      1. y = F(x, {W_i}) + x (identity shortcut, Eqn. 1)
      2. introduces no extra parameters and no extra computation
      3. y = F(x, {W_i}) + x   or   y = F(x, {W_i}) + W_s x (projection shortcut, Eqn. 2)
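The identity-shortcut idea above can be sketched in a few lines. This is a hypothetical NumPy toy (fully connected layers instead of convolutions; weight names `W1`/`W2` are made up), only meant to show that y = relu(F(x) + x) reuses the input via element-wise addition and that the shortcut itself carries no learnable parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # input/output dimension (equal, so the identity shortcut applies directly)

# parameters of the residual branch F(x) = W2 @ relu(W1 @ x)
W1 = rng.standard_normal((d, d)) * 0.1
W2 = rng.standard_normal((d, d)) * 0.1

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x):
    # F(x): two nonlinear layers fitting the residual H(x) - x
    f = W2 @ relu(W1 @ x)
    # identity shortcut: element-wise addition, no parameters of its own
    return relu(f + x)

x = rng.standard_normal(d)
y = residual_block(x)
print(y.shape)  # same shape as the input
```

Note that if the residual branch outputs zero, the block degenerates to relu(x), which is exactly why identity mappings are easy for these blocks to represent.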
  4. related work
    1. residual representation
    2. shortcut connection
  5. deep residual learning
    1. residual learning
      1. If one hypothesizes that multiple nonlinear layers can asymptotically approximate complicated functions, then it is equivalent to hypothesize that they can asymptotically approximate the residual functions, i.e., H(x) − x (assuming that the input and output are of the same dimensions). The two ways of fitting may differ in how easy they are to train.
      2. The degradation problem suggests that the solvers might have difficulties in approximating identity mappings by multiple nonlinear layers, since deeper models do not achieve better performance than their shallower counterparts.
    2. Identity Mapping by Shortcuts
      1. a shortcut connection and element-wise addition
      2. the residual function F has two (or more) layers, e.g. F = W_2 σ(W_1 x), where σ denotes ReLU
    3. Network architecture
      1. [figure: plain vs. residual network architectures]

      2. When the dimensions increase, we consider two options: (A) The shortcut still performs identity mapping, with extra zero entries padded for increasing dimensions. This option introduces no extra parameter; (B) The projection shortcut in Eqn.(2) is used to match dimensions (done by 1×1 convolutions). For both options, when the shortcuts go across feature maps of two sizes, they are performed with a stride of 2.
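Option (A) can be made concrete. The sketch below is an assumed NumPy implementation (N×C×H×W layout; the helper name `zero_pad_shortcut` is mine, not from the paper) of the parameter-free shortcut across a stage boundary: subsample spatially with a stride of 2 and zero-pad the extra channels:

```python
import numpy as np

def zero_pad_shortcut(x, out_channels):
    """Option (A): parameter-free shortcut when dimensions increase.

    x: feature map of shape (N, C, H, W), with H and W even.
    Spatial size is halved with a stride of 2, and the extra output
    channels are padded with zeros, so no parameters are introduced.
    """
    n, c, h, w = x.shape
    sub = x[:, :, ::2, ::2]            # stride-2 subsampling
    pad = out_channels - c             # number of zero channels to append
    zeros = np.zeros((n, pad, h // 2, w // 2), dtype=x.dtype)
    return np.concatenate([sub, zeros], axis=1)

x = np.ones((1, 64, 56, 56))           # e.g. end of a 56x56, 64-channel stage
y = zero_pad_shortcut(x, 128)
print(y.shape)  # (1, 128, 28, 28)
```

Option (B) would instead apply a learned 1×1 convolution with stride 2 to match dimensions, at the cost of extra parameters.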

      3. each convolution is followed by batch normalization and then the activation (convolution + BN + activation)
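The conv + BN + activation ordering can be sketched as follows. This is a toy NumPy version (a 1×1 "convolution" written as a per-pixel channel mix, training-mode BN without the learned scale/shift; all names are mine), only meant to show that BN sits between the convolution and the ReLU:

```python
import numpy as np

def conv1x1(x, w):
    # 1x1 convolution as a per-pixel channel mix: (N, C_in, H, W) -> (N, C_out, H, W)
    return np.einsum("oc,nchw->nohw", w, x)

def batch_norm(x, eps=1e-5):
    # normalize each channel over the batch and spatial axes
    # (training-mode BN, omitting the learned gamma/beta for brevity)
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def conv_bn_relu(x, w):
    # the ordering used in the notes above: convolution -> BN -> activation
    return np.maximum(batch_norm(conv1x1(x, w)), 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3, 8, 8))
w = rng.standard_normal((16, 3))
out = conv_bn_relu(x, w)
print(out.shape)  # (4, 16, 8, 8)
```

Normalizing before the activation keeps the pre-activation distribution centered, which is part of how BN mitigates the vanishing/exploding-gradient problem mentioned in the introduction.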

  6. Other models:

