1. Forward Propagation
In Caffe, the convolution performed by a convolution layer is recast as the matrix product W * Cin, where the weight matrix W (denoted W below) is assembled from the convolution kernel parameters and the input matrix Cin is assembled from the elements of the input feature map. Before the product can be computed, the kernel parameters and the feature map must each be rearranged into W and Cin.

The following example illustrates these two rearrangements. Suppose a convolution layer receives a feature map of size c × h × w = 3 × 8 × 8 as input, with kernel size h1 × w1 = 2 × 2, number of kernels c1 = 4, stride = 1, and pad_h = pad_w = 0.

The rearrangement of the feature map into Cin is shown in the figure below (the figure depicts the processing of a single sample; for a batch of batch_size samples, Caffe likewise computes the convolution of each input one by one inside a loop).
As the figure shows, the multi-channel feature map is flattened into a single matrix, which Caffe stores in one array in row-major order. The height and width of the output feature map are

ho = (h + 2 × pad_h - h1) / stride + 1
wo = (w + 2 × pad_w - w1) / stride + 1

so col_buff (i.e., Cin) has dimensions height × width = (c × h1 × w1) × (ho × wo). For the example above, ho = wo = (8 + 0 - 2)/1 + 1 = 7, so Cin is a 12 × 49 matrix.
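To make the rearrangement concrete, the following is a minimal im2col sketch in C++ (a simplification of the idea behind Caffe's im2col_cpu in src/caffe/util/im2col.cpp; padding and dilation are omitted here, so it covers only the pad = 0 case used in the example):

// Minimal im2col sketch (pad = 0; Caffe's im2col_cpu also handles
// padding, per-axis strides and dilation).
// data_im:  input feature map, row-major, shape c x h x w
// data_col: col_buff output,   row-major, shape (c*h1*w1) x (ho*wo)
void im2col_sketch(const float* data_im, int c, int h, int w,
                   int h1, int w1, int stride, float* data_col) {
  const int ho = (h - h1) / stride + 1;
  const int wo = (w - w1) / stride + 1;
  // Each row of col_buff corresponds to one (channel, kh, kw) offset
  // within the kernel window; each column to one output position.
  for (int ch = 0; ch < c; ++ch)
    for (int kh = 0; kh < h1; ++kh)
      for (int kw = 0; kw < w1; ++kw) {
        const int row = (ch * h1 + kh) * w1 + kw;
        for (int oh = 0; oh < ho; ++oh)
          for (int ow = 0; ow < wo; ++ow)
            data_col[row * (ho * wo) + oh * wo + ow] =
                data_im[(ch * h + oh * stride + kh) * w + ow * stride + kw];
      }
}

For the example (c = 3, h = w = 8, h1 = w1 = 2, stride = 1) this fills exactly the 12 × 49 col_buff described above.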
The rearrangement of the kernel parameters into W is shown in the figure below.

The weight matrix has dimensions height × width = c1 × (c × h1 × w1). Caffe stores data in Blob structures, whose storage order is number of samples (num) × channels (c) × height (h) × width (w), with w having the lowest priority, i.e., adjacent elements along the w dimension have the smallest address difference. Stored this way, the kernel parameters already form exactly the weight matrix W, so no rearrangement is needed.
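Because W needs no rearrangement, the forward computation for one sample reduces to a single matrix product, which Caffe performs with caffe_cpu_gemm (a thin BLAS wrapper). The naive sketch below is written only to show the shapes involved, not to reproduce Caffe's actual code:

// Naive sketch of the forward product Cout = W * Cin.
// W:    c1 x K, where K = c*h1*w1 (the kernel Blob used as-is)
// Cin:  K  x N, where N = ho*wo   (col_buff produced by im2col)
// Cout: c1 x N                    (one row per output feature map)
void conv_forward_sketch(const float* W, const float* Cin,
                         int c1, int K, int N, float* Cout) {
  for (int i = 0; i < c1; ++i)
    for (int j = 0; j < N; ++j) {
      float acc = 0.f;
      for (int k = 0; k < K; ++k)
        acc += W[i * K + k] * Cin[k * N + j];
      Cout[i * N + j] = acc;  // the bias term, if any, is added afterwards
    }
}

For the example, c1 = 4, K = 12 and N = 49, so Cout is 4 × 49 and is read back as the 4 × 7 × 7 output feature map.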
Below we use LeNet, one of the examples shipped with Caffe, to walk through how the source code implements this (the parameter values in the code comments are those seen with batch_size = 64 when the forward pass reaches the conv2 layer).

The network structure is as follows:
name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
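To connect this prototxt to the matrix shapes above, it helps to trace the feature-map sizes from the 1 × 28 × 28 MNIST input up to conv2; the little check below is a sketch of my own using the ho/wo formula (pad = 0 throughout):

#include <cassert>

// Shape trace up to conv2 for the LeNet definition above.
int main() {
  int h = 28;             // data:  1 x 28 x 28 (MNIST)
  h = (h - 5) / 1 + 1;    // conv1: 20 x 24 x 24
  h = (h - 2) / 2 + 1;    // pool1: 20 x 12 x 12
  h = (h - 5) / 1 + 1;    // conv2: 50 x 8 x 8
  assert(h == 8);
  return 0;
}

So when the forward pass reaches conv2, W is 50 × (20 × 5 × 5) = 50 × 500 and col_buff is 500 × (8 × 8) = 500 × 64, and this product is computed 64 times per batch, once for each sample in the loop (that col_buff also has 64 columns here is a coincidence of 8 × 8 = 64, unrelated to batch_size).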