Classic CNN Architectures: AlexNet Explained (Introduction to CNNs, with a Keras Implementation)

Background

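In 2012, some 14 years after LeNet-5 (1998), AlexNet achieved the best results of its time at classifying images into 1000 categories on the ImageNet LSVRC-2010 dataset, and it also took first place on ILSVRC-2012. After AlexNet, image classification models became increasingly complex and networks increasingly deep. AlexNet can still be applied to small datasets today for quick validation experiments.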

Original paper

https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
The paper was published at NIPS. At the time of writing it had been cited 36,782 times.

My GitHub implementation

https://github.com/uestcsongtaoli/AlexNet

Model overview

[Figure: AlexNet architecture diagram from the original paper, annotated]
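The figure above is the AlexNet architecture diagram from Alex Krizhevsky's original paper, with some simple annotations added.
The model consists of 5 convolutional layers and 3 fully connected layers, plus 3 pooling layers.
First, define a conv block made up of a convolution, BatchNormalization, and an Activation: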

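# Layers used throughout this post (tensorflow.keras.layers works the same way)
from keras.layers import Activation, BatchNormalization, Conv2D, Dense, Flatten, MaxPool2D

def Conv_block(layer, filters, kernel_size=(3, 3), strides=(1, 1), padding="valid", name=None):
    # Convolution -> batch normalization -> ReLU
    x = Conv2D(filters=filters, kernel_size=kernel_size, strides=strides, padding=padding, name=name)(layer)
    x = BatchNormalization()(x)
    x = Activation("relu")(x)
    return x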

Convolutional layer Conv_1

x = Conv_block(x, filters=96, kernel_size=(11, 11), strides=(4, 4), padding="valid", name="Conv_1_96_11x11_4")

Pooling layer Pool_1

x = MaxPool2D(pool_size=(3, 3), strides=(2, 2), padding="valid", name="maxpool_1_3x3_2")(x)

Convolutional layer Conv_2

x = Conv_block(x, filters=256, kernel_size=(5, 5), strides=(1, 1), padding="same", name="Conv_2_256_5x5_1")

Pooling layer Pool_2

x = MaxPool2D(pool_size=(3, 3), strides=(2, 2), padding="valid", name="maxpool_2_3x3_2")(x)

Convolutional layer Conv_3

x = Conv_block(x, filters=384, kernel_size=(3, 3), strides=(1, 1), padding="same", name="Conv_3_384_3x3_1")

Convolutional layer Conv_4

x = Conv_block(x, filters=384, kernel_size=(3, 3), strides=(1, 1), padding="same", name="Conv_4_384_3x3_1")

Convolutional layer Conv_5

x = Conv_block(x, filters=256, kernel_size=(3, 3), strides=(1, 1), padding="same", name="Conv_5_256_3x3_1")

Pooling layer Pool_3

x = MaxPool2D(pool_size=(3, 3), strides=(2, 2), padding="valid", name="maxpool_3_3x3_2")(x)

Fully connected layer FC_1

x = Flatten()(x)
x = Dense(units=4096)(x)
x = BatchNormalization()(x)
x = Activation("relu")(x)

Fully connected layer FC_2

x = Dense(units=4096)(x)
x = BatchNormalization()(x)
x = Activation("relu")(x)

Fully connected layer FC_3

x = Dense(units=num_classes)(x)
x = BatchNormalization()(x)
x = Activation("softmax")(x)
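
To make the layer sequence above easier to follow, here is a minimal sketch in plain Python that traces the feature-map size through the convolution and pooling layers. It assumes a square 227x227x3 input (the layer code above does not fix the input size) and the usual Keras rules for "valid" and "same" padding. The names out_size, LAYERS and trace_shapes are made up for this illustration and are not part of Keras.

```python
def out_size(size, kernel, stride, padding):
    """Spatial output size of a conv/pool layer under Keras "valid"/"same" rules."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1  # "valid": no padding is added


# (name, kernel, stride, padding, output channels); pooling keeps the channel count
LAYERS = [
    ("Conv_1", 11, 4, "valid", 96),
    ("Pool_1", 3, 2, "valid", None),
    ("Conv_2", 5, 1, "same", 256),
    ("Pool_2", 3, 2, "valid", None),
    ("Conv_3", 3, 1, "same", 384),
    ("Conv_4", 3, 1, "same", 384),
    ("Conv_5", 3, 1, "same", 256),
    ("Pool_3", 3, 2, "valid", None),
]


def trace_shapes(input_size, channels=3):
    """Return [(layer name, height/width, channels)] for a square input."""
    shapes = []
    size = input_size
    for name, kernel, stride, padding, filters in LAYERS:
        size = out_size(size, kernel, stride, padding)
        channels = filters or channels  # pooling layers leave channels unchanged
        shapes.append((name, size, channels))
    return shapes


for name, size, ch in trace_shapes(227):
    print(f"{name}: {size}x{size}x{ch}")

_, final_size, final_ch = trace_shapes(227)[-1]
print("Flatten:", final_size * final_size * final_ch)
```

With a 227 input, Conv_1 produces 55x55x96 and the last pooling layer 6x6x256, so Flatten yields 9216 values. Calling trace_shapes(224) instead gives 54x54 after Conv_1, which is why the input size matters for matching the sizes in the paper's figure.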

Personal notes

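There is a trick in the implementation: the input image size should be 227, because only then does the first convolution produce a 55x55 output. I therefore add zero_padding before the first layer; see my GitHub code for details.
AlexNet has roughly 60 million parameters.
Since AlexNet, networks have mostly used ReLU as the activation function, because ReLU converges faster and, not saturating for positive inputs, largely avoids the vanishing-gradient problem.
Pooling is also mostly max pooling.
Batch normalization is added here (the original paper used local response normalization instead); it speeds up convergence and acts somewhat like a regularizer, reducing overfitting.
The original paper also uses Dropout to prevent overfitting; it is mostly applied to fully connected layers.
The optimizer is SGD.
The batch_size used at the time was 128.
The paper uses group convolution, i.e. splitting a convolution into blocks, which is a form of model parallelism. Constrained by the hardware available at the time, the authors split the convolutional part evenly into two halves. The advantages are: 1. faster convergence, since more images can be processed at a time; 2. fewer parameters; 3. each filter group can learn different features of the data.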
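
The parameter count and the "fewer parameters" effect of group convolution mentioned in the notes above can be checked with a short calculation. This is a sketch under stated assumptions: 1000 output classes, a 6x6x256 feature map before the first fully connected layer, two groups in Conv_2, Conv_4 and Conv_5 as in the paper's two-GPU layout, and no BatchNormalization parameters counted. The helper names conv_params, dense_params and total_params are made up for this illustration.

```python
def conv_params(k, c_in, c_out, groups=1):
    """Weights + biases of a k x k convolution.

    With groups > 1, each filter only sees c_in / groups input channels.
    """
    return k * k * (c_in // groups) * c_out + c_out


def dense_params(n_in, n_out):
    """Weights + biases of a fully connected layer."""
    return n_in * n_out + n_out


# (kernel, input channels, output channels, groups in the two-GPU layout)
CONVS = [
    (11, 3, 96, 1),    # Conv_1
    (5, 96, 256, 2),   # Conv_2
    (3, 256, 384, 1),  # Conv_3 (connected to all maps of Conv_2)
    (3, 384, 384, 2),  # Conv_4
    (3, 384, 256, 2),  # Conv_5
]
DENSES = [(6 * 6 * 256, 4096), (4096, 4096), (4096, 1000)]


def total_params(grouped):
    """Total parameters with or without the two-group split."""
    convs = sum(
        conv_params(k, c_in, c_out, g if grouped else 1)
        for k, c_in, c_out, g in CONVS
    )
    return convs + sum(dense_params(n_in, n_out) for n_in, n_out in DENSES)


print("grouped:  ", total_params(grouped=True))
print("ungrouped:", total_params(grouped=False))
print("Conv_2 alone:", conv_params(5, 96, 256, groups=2), "vs", conv_params(5, 96, 256))
```

Under these assumptions the grouped network comes to 60,965,224 parameters, in line with the figure of roughly 60 million, while the ungrouped variant built from plain Conv2D layers has 62,378,344. Most of the total sits in the fully connected layers, so grouping roughly halves the weights of the affected convolutions but changes the overall count only slightly.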

Model explanations

ImageNet Classification with Deep Convolutional Neural Networks
http://vision.stanford.edu/teaching/cs231b_spring1415/slides/alexnet_tugce_kyunghee.pdf
Understanding AlexNet
https://www.learnopencv.com/understanding-alexnet/
A Walk-through of AlexNet
https://medium.com/@smallfishbigsea/a-walk-through-of-alexnet-6cbd137a5637
AlexNet
Why ReLU is used
http://cvml.ist.ac.at/courses/DLWT_W17/material/AlexNet.pdf
AlexNet Implementation Using Keras
https://engmrk.com/alexnet-implementation-using-keras/
AlexNet Keras Implementation
GitHub
https://github.com/eweill/keras-deepcv/blob/master/models/classification/alexnet.py
Plant Diseases Classification Using AlexNet
Kaggle application
https://www.kaggle.com/vipoooool/plant-diseases-classification-using-alexnet
Finetuning AlexNet with TensorFlow
Transfer learning, fine-tuning
https://kratzert.github.io/2017/02/24/finetuning-alexnet-with-tensorflow.html
AlexNet implementation + weights in TensorFlow
Model weights, for transfer learning
http://www.cs.toronto.edu/~guerzhoy/tf_alexnet/