Notes: Google's DeepLabv3+ code is now open source; see deeplab (GitHub), which also includes a usage demo.
0.
spatial pyramid pooling
- probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view
- encode multi-scale contextual information
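To make "multiple rates and multiple effective fields-of-view" concrete, here is a minimal pure-Python sketch of a 1-D atrous (dilated) convolution. The function names and the example rates are illustrative assumptions, not taken from the DeepLab code; the real model works on 2-D feature maps with learned kernels.

```python
def effective_kernel_size(kernel_size, rate):
    """Span covered by a kernel whose taps sit `rate` samples apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def atrous_conv1d(signal, kernel, rate):
    """1-D atrous convolution with zero padding (output length == input length)."""
    half = (len(kernel) // 2) * rate  # offset of the first tap from the center
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - half + k * rate  # input index read by tap k
            if 0 <= j < len(signal):  # taps outside the signal read zeros
                acc += w * signal[j]
        out.append(acc)
    return out


def probe_multi_rate(signal, kernel, rates):
    """Apply the same kernel at several rates, giving one output per rate."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


if __name__ == "__main__":
    # Illustrative rates: the field-of-view grows, the number of weights does not.
    for rate in (1, 6, 12, 18):
        print(rate, effective_kernel_size(3, rate))
```

The same 3-tap kernel covers a wider span as the rate grows, so running several rates in parallel and combining the outputs is one way to encode context at multiple scales without adding weights per branch.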
encoder-decoder
- gradually recovering the spatial information
- capture sharper object boundaries
convolution
- depthwise convolution: a spatial convolution performed independently over each channel of an input
- pointwise convolution: a 1x1 convolution, projecting the channels output by the depthwise convolution onto a new channel space
- differences (order of operations):
  - Inception: 1x1 conv first
  - depthwise separable convolution: channel-wise (depthwise) spatial conv first
details can be found here.
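The two steps of a depthwise separable convolution can be sketched in plain Python as follows. This is a toy 1-D version with hypothetical function names, meant only to show the depthwise-then-pointwise order and the saving in weights; it is not the DeepLab implementation.

```python
def depthwise_conv1d(x, kernels):
    """Filter each channel independently with its own kernel (no padding).

    x: list of channels, each a list of samples; kernels: one kernel per channel.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        n = len(channel) - len(kernel) + 1
        out.append([
            sum(kernel[k] * channel[i + k] for k in range(len(kernel)))
            for i in range(n)
        ])
    return out


def pointwise_conv(x, weights):
    """1x1 convolution: weights[o][c] mixes input channel c into output channel o."""
    length = len(x[0])
    return [
        [sum(row[c] * x[c][i] for c in range(len(x))) for i in range(length)]
        for row in weights
    ]


def depthwise_separable_conv1d(x, kernels, weights):
    """Depthwise (spatial) step first, then pointwise (channel-mixing) step."""
    return pointwise_conv(depthwise_conv1d(x, kernels), weights)


def param_counts(kernel_elems, c_in, c_out):
    """Weight counts (biases ignored): standard conv vs. depthwise separable conv.

    kernel_elems is the number of spatial taps, e.g. 9 for a 3x3 kernel.
    """
    standard = kernel_elems * c_in * c_out
    separable = kernel_elems * c_in + c_in * c_out
    return standard, separable
```

For example, `param_counts(9, 256, 256)` gives 589824 weights for a standard 3x3 convolution against 67840 for the separable form, which is the main reason for the factorization.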
contributions
- decoder module: refine the segmentation results especially along object boundaries
- depthwise separable convolution, applied to both:
  - Atrous Spatial Pyramid Pooling
  - decoder
3. Methods
- capture multi-scale context