Paper: "Going Deeper with Convolutions", Google Inc.
Abstract: GoogLeNet, a 22-layer deep network, whose quality is assessed in the context of classification and detection.

Inception: a deep convolutional neural network architecture

Introduction
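The biggest gains in object detection have come from the synergy of deep architectures and classical computer vision.
GoogLeNet was proposed for classification and detection because of the rapid progress in the capabilities of deep learning and convolutional networks, together with improved hardware support.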

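A computational budget of 1.5 billion multiply-adds at inference time means the network is not just an academic exercise but can also be put to practical use.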

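"Deep" has two meanings here: increased network depth, and a new level of organization in the form of the Inception module.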

Inception uses 1×1 convolutional layers mainly as dimension-reduction modules that remove computational bottlenecks.
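To make the bottleneck argument concrete, here is a minimal sketch that counts multiply-adds for a 5×5 convolution applied directly versus after a 1×1 reduction. The feature-map size and channel counts are illustrative assumptions, not values taken from these notes, and the helper name `conv_multiply_adds` is hypothetical.

```python
def conv_multiply_adds(height, width, in_channels, out_channels, kernel):
    """Multiply-adds for a stride-1, same-padded convolution (bias ignored)."""
    return height * width * in_channels * out_channels * kernel * kernel


# Illustrative sizes (assumptions): 28x28 feature map, 192 input channels,
# 32 output channels from the 5x5 convolution, 16-channel 1x1 bottleneck.
H, W, C_IN, C_OUT, C_REDUCE = 28, 28, 192, 32, 16

# 5x5 convolution applied directly to all 192 input channels.
direct = conv_multiply_adds(H, W, C_IN, C_OUT, 5)

# 1x1 convolution down to 16 channels, then the 5x5 convolution.
reduced = (conv_multiply_adds(H, W, C_IN, C_REDUCE, 1)
           + conv_multiply_adds(H, W, C_REDUCE, C_OUT, 5))

print(f"direct 5x5:       {direct:,} multiply-adds")
print(f"1x1 then 5x5:     {reduced:,} multiply-adds")
print(f"reduction factor: {direct / reduced:.1f}x")
```

Under these assumed sizes the 1×1 layer costs little by itself, while the expensive 5×5 filters only see 16 channels instead of 192, which is where the saving comes from.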

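Two problems arise with larger networks: more parameters make overfitting more likely, and the computational cost increases.
Solution: introduce sparsity, and use dropout.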

Architectural Details
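The main idea of the Inception architecture is to consider how an optimal local sparse structure of a convolutional vision network can be approximated and covered by readily available dense components.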

Inception module: the naive version and the version with dimension reduction.
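The difference between the two module variants can be sketched by tracking output channels only. In the naive version the pooling branch passes every input channel through to the concatenation, so the width grows with each stacked module; in the reduced version a 1×1 projection caps it. The filter counts below are illustrative assumptions (chosen to resemble an early Inception stage), and both function names are hypothetical.

```python
def naive_output_channels(in_channels, n1x1, n3x3, n5x5):
    # Pooling keeps all input channels, so they are concatenated unchanged.
    return n1x1 + n3x3 + n5x5 + in_channels


def reduced_output_channels(n1x1, n3x3, n5x5, pool_proj):
    # A 1x1 projection after pooling fixes the width of the pooling branch.
    return n1x1 + n3x3 + n5x5 + pool_proj


# Assumed filter counts for one module, with a 192-channel input.
filters = dict(n1x1=64, n3x3=128, n5x5=32)
in_ch = 192

naive = naive_output_channels(in_ch, **filters)
reduced = reduced_output_channels(**filters, pool_proj=32)
print("naive module output channels:  ", naive)
print("reduced module output channels:", reduced)

# Stacking three naive modules with the same filters: the width keeps growing.
widths = [in_ch]
for _ in range(3):
    widths.append(naive_output_channels(widths[-1], **filters))
print("naive stack widths:", widths)
```

This only counts channels; it does not model the extra 1×1 reductions placed before the 3×3 and 5×5 convolutions, which lower the compute cost without changing the output width.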
GoogLeNet
The exact structure of the extra network on the side, including the auxiliary classifier, is as follows:
• An average pooling layer with 5×5 filter size and stride 3, resulting in a 4×4×512 output for the (4a), and 4×4×528 for the (4d) stage.
• A 1×1 convolution with 128 filters for dimension reduction and rectified linear activation.
• A fully connected layer with 1024 units and rectified linear activation.
• A dropout layer with 70% ratio of dropped outputs.
• A linear layer with softmax loss as the classifier (predicting the same 1000 classes as the main classifier, but removed at inference time).
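The shapes in the list above can be checked with a short sketch. It assumes the feature map entering the auxiliary head is 14×14 (an assumption; the list only states the 4×4 result) and that pooling uses no padding. The helper names `pool_output_size` and `aux_head_shapes` are hypothetical.

```python
def pool_output_size(size, kernel, stride):
    """Spatial size after pooling with no padding (floor mode)."""
    return (size - kernel) // stride + 1


def aux_head_shapes(in_size, in_channels, num_classes=1000):
    """Return (layer, output shape) pairs for the auxiliary classifier."""
    side = pool_output_size(in_size, kernel=5, stride=3)
    return [
        ("avg_pool 5x5/3", (side, side, in_channels)),
        ("conv 1x1, 128", (side, side, 128)),
        ("fc 1024", (1024,)),
        ("dropout 0.7", (1024,)),
        ("linear + softmax", (num_classes,)),
    ]


# Assumed 14x14 input; channel counts for stages (4a) and (4d) from the list.
for stage, channels in (("4a", 512), ("4d", 528)):
    print(f"stage {stage}")
    for name, shape in aux_head_shapes(14, channels):
        print(f"  {name:18s} {shape}")
```

With a 14×14 input, (14 - 5) // 3 + 1 = 4, which matches the 4×4×512 and 4×4×528 outputs stated for the pooling layer.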


ILSVRC 2014 Classification Challenge Setup and Results
ILSVRC 2014 Detection Challenge Setup and Results

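A bounding box regressor can be pre-trained.
Table 4: ranking of entries using multiple models (ensembles)
Table 5: single-model results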
In Table 5, we compare results using a single model only. The top performing model is by Deep Insight and surprisingly only improves by 0.3 points with an ensemble of 3 models while the GoogLeNet obtains significantly stronger results with the ensemble.

Conclusions
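Approximating the expected optimal sparse structure with readily available dense building blocks is a viable way to improve neural networks for computer vision.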
