Week 1: Foundations of Convolutional Neural Networks

1. Convolution on 2D images

Notes on Convolutional Neural Networks (from deeplearning.ai) WEEK 1

  • input: $n\times n$
  • padding: $p$
  • stride: $s$
  • kernel: $f\times f$ (must be odd!)
  • output: $\left\lfloor\dfrac{n+2p-f}{s}+1\right\rfloor \times \left\lfloor\dfrac{n+2p-f}{s}+1\right\rfloor$ (rounded down)
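The output-size formula above can be checked with a short helper (a minimal sketch; the function name `conv_output_size` is my own, not from the course):

```python
import math

def conv_output_size(n, f, p=0, s=1):
    """Spatial output size of a convolution: floor((n + 2p - f)/s + 1)."""
    return math.floor((n + 2 * p - f) / s + 1)

# A 7x7 input, 3x3 kernel, no padding, stride 1 -> 5x5 output.
print(conv_output_size(7, 3))            # 5
# The same input with padding p=1 and stride s=2 -> 4x4 output.
print(conv_output_size(7, 3, p=1, s=2))  # 4
```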

1.1 Valid convolution

  • input: $n\times n$
  • padding: $p=0$
  • stride: $s=1$
  • kernel: $f\times f$ (must be odd!)
  • output: $(n-f+1) \times (n-f+1)$

1.2 Same convolution

  • input: $n\times n$
  • padding: $p=(f-1)/2$
  • stride: $s=1$
  • kernel: $f\times f$ (must be odd!)
  • output: $(n+2p-f+1) \times (n+2p-f+1) = n\times n$
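A quick check that $p=(f-1)/2$ really preserves the input size (a sketch; `same_padding` is a hypothetical helper name):

```python
def same_padding(f):
    """Padding that preserves the spatial size at stride 1; f must be odd."""
    assert f % 2 == 1, "symmetric 'same' padding requires an odd kernel size"
    return (f - 1) // 2

# With p = (f - 1)/2 the output size n + 2p - f + 1 collapses back to n.
n, f = 32, 5
p = same_padding(f)       # 2
print(n + 2 * p - f + 1)  # 32
```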

2. Convolution vs. cross-correlation

  • What we have described as "convolution" so far is, strictly speaking, the cross-correlation, given by:
    $$S(i,j)=(I*K)(i,j)=\sum_{m,n}I(i+m,j+n)K(m,n)\tag{1}$$

  • The mathematical definition of convolution differs from the image "convolution" above in that it flips the kernel, i.e. mirrors it once along the horizontal axis and once along the vertical axis (note: this is not a transpose):
    $$S(i,j)=(I*K)(i,j)=\sum_{m,n}I(m,n)K(i-m,j-n)\tag{2}$$

  • The only purpose of flipping the kernel is to make the operation commutative, i.e. equation (2) is equivalent to equation (3):
    $$S(i,j)=(K*I)(i,j)=\sum_{m,n}I(i-m,j-n)K(m,n)\tag{3}$$

  • Commutativity is not an important property for neural networks, so in the neural-network literature both operations are simply called convolution.
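The difference between equations (1) and (2) can be made concrete with a small NumPy sketch (the function names are my own, and the nested loops are for clarity only; a real implementation would use an optimized library routine):

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """'Valid' cross-correlation, eq. (1): slide the kernel with no flip."""
    f = kernel.shape[0]
    n = image.shape[0] - f + 1
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = np.sum(image[i:i + f, j:j + f] * kernel)
    return out

def convolve2d(image, kernel):
    """True convolution, eq. (2): cross-correlate with a doubly flipped kernel."""
    return cross_correlate2d(image, kernel[::-1, ::-1])

rng = np.random.default_rng(0)
img = rng.standard_normal((5, 5))
k = rng.standard_normal((3, 3))

# The two operations differ unless the kernel is symmetric under a 180-degree flip.
print(np.allclose(cross_correlate2d(img, k), convolve2d(img, k)))
# Flipping the kernel twice restores it, so this recovers eq. (1) exactly:
print(np.allclose(convolve2d(img, k[::-1, ::-1]), cross_correlate2d(img, k)))  # True
```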

3. Convolution on RGB images


4. One layer of a convolutional network


5. Summary of notation


If layer $l$ is a convolution layer:

  • input: $n_H^{[l-1]}\times n_W^{[l-1]}\times n_C^{[l-1]}$ (height × width × channels)
  • output: $n_H^{[l]} \times n_W^{[l]} \times n_C^{[l]}$
  • where:
    $n_H^{[l]}=\left\lfloor\dfrac{n_H^{[l-1]}+2p^{[l]}-f^{[l]}}{s^{[l]}}+1\right\rfloor$
    $n_W^{[l]}=\left\lfloor\dfrac{n_W^{[l-1]}+2p^{[l]}-f^{[l]}}{s^{[l]}}+1\right\rfloor$
    $n_C^{[l]}=$ number of filters
    $f^{[l]}=$ filter size
    $p^{[l]}=$ padding
    $s^{[l]}=$ stride
  • Each filter is $f^{[l]} \times f^{[l]} \times n_C^{[l-1]}$
  • Activations: $a^{[l]}$ has shape $n_H^{[l]} \times n_W^{[l]} \times n_C^{[l]}$
  • Weights: $f^{[l]} \times f^{[l]} \times n_C^{[l-1]} \times n_C^{[l]}$
  • Bias: $n_C^{[l]}$
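Putting the notation together, the output shape and parameter count of one conv layer can be sketched as follows (the helper function and the 39×39×3 example numbers are illustrative, not from the notes):

```python
import math

def conv_layer_shapes(n_h, n_w, n_c_prev, f, p, s, n_filters):
    """Output shape and learnable-parameter count of conv layer l."""
    n_h_out = math.floor((n_h + 2 * p - f) / s + 1)
    n_w_out = math.floor((n_w + 2 * p - f) / s + 1)
    weights = f * f * n_c_prev * n_filters  # each filter is f x f x n_C^[l-1]
    biases = n_filters                      # one bias per filter
    return (n_h_out, n_w_out, n_filters), weights + biases

# Example: a 39x39x3 input, ten 3x3 filters, no padding, stride 1.
shape, params = conv_layer_shapes(39, 39, 3, f=3, p=0, s=1, n_filters=10)
print(shape)   # (37, 37, 10)
print(params)  # 3*3*3*10 + 10 = 280
```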

6. Example of ConvNet


7. Types of layers in a convolutional network:

  • Convolution (CONV)
  • Pooling (POOL)
  • Fully connected (FC)

8. Pooling layers

  • A pooling function replaces the network's output at a location with a summary statistic of the nearby outputs, e.g. the maximum (max pooling) or the average (average pooling).
  • Pooling functions are fixed; they have no parameters that need to be learned.
  • Pooling layers can greatly improve the network's efficiency.
  • Hyperparameters:
    - $f$: filter size
    - $s$: stride
    - pooling type: max or average
    - $p$: padding (rarely used)


  • input: $n_H\times n_W\times n_C$
  • output: $\left\lfloor\dfrac{n_H-f}{s}+1\right\rfloor \times \left\lfloor\dfrac{n_W-f}{s}+1\right\rfloor \times n_C$
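A minimal max-pooling sketch matching the size formula above (loop-based for clarity; `max_pool2d` is my own name, and real frameworks provide fused implementations):

```python
import numpy as np

def max_pool2d(x, f=2, s=2):
    """Max pooling on an n_H x n_W x n_C volume; each channel is pooled independently."""
    n_h, n_w, n_c = x.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    out = np.zeros((out_h, out_w, n_c))
    for i in range(out_h):
        for j in range(out_w):
            # Maximum over the f x f window, taken per channel.
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].max(axis=(0, 1))
    return out

x = np.arange(4 * 4 * 2, dtype=float).reshape(4, 4, 2)
print(max_pool2d(x).shape)  # (2, 2, 2): channels pass through unchanged
```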

9. CNN example


  • As the network gets deeper, the height $n_H$ and width $n_W$ of the layers keep shrinking while the number of channels $n_C$ keeps growing.
  • As the network gets deeper, the total number of elements per layer (the activation size) keeps decreasing.
  • A convolutional layer (CONV) has far fewer learnable parameters than a fully connected layer (FC).
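The last point can be checked with a back-of-the-envelope count (the configuration below, a 32×32×3 input with six 5×5 filters, is illustrative; the counts are mine):

```python
# One conv layer: six 5x5 filters over a 32x32x3 input (valid, stride 1).
# Weights: f*f*n_C_prev per filter, plus one bias per filter.
conv_params = 5 * 5 * 3 * 6 + 6
print(conv_params)  # 456

# A fully connected layer mapping the same 32x32x3 input to the same
# 28x28x6 output volume needs one weight per (input, output) pair.
fc_params = (32 * 32 * 3) * (28 * 28 * 6) + 28 * 28 * 6
print(fc_params)  # 14455392 -- roughly 14.5 million, vs. 456 for the conv layer
```

Parameter sharing (the same filter reused at every position) and sparse interactions (each output touching only $f\times f$ inputs) are exactly what make the first count so much smaller.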

10. Why convolutions?

  • Sparse interactions
  • Parameter sharing

11. Training CNN

