Abstract:

This paper proposes a detector with a better speed/accuracy trade-off. It studies two questions:
How to make the anchor-free detection head better?
How to utilize the power of the feature pyramid better?
It answers them with two softened optimization techniques, used as training strategies:
1) soft-weighted anchor points
2) soft-selected pyramid levels

Experiments show that the concise SAPD pushes the speed/accuracy trade-off to a new level: on COCO it reaches a single-model single-scale AP of 47.4% and is 5x faster than other detectors.
———————————————————————————————————————————————
Object detectors are either anchor-based or anchor-free. Anchor-free detectors fall into two main groups: anchor-point detection and key-point detection.

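Anchor-point methods:

Bounding boxes are encoded and decoded as anchor points with point-to-boundary distances. Each anchor point is a pixel on the pyramid feature maps and carries the corresponding distance information.
Pros: 1. simple framework 2. fast training 3. benefits from a larger backbone 4. flexible feature level selection
Cons: lower mAP at the same image scale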

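Key-point methods:

They predict the key points of the bounding box.
Pros: high mAP even with a small input image size
Cons: the feature map cannot be downsampled too much, so these methods need a single high-resolution feature map and repeated bottom-up and top-down processing. As a result FLOPs are large, training is slow, memory consumption is high, and compatibility with pretrained backbones is poor.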
Soft Anchor-Point Object Detection
Without DCN (deformable convolution): 5x faster
With DCN (deformable convolution): high mAP


General framework of an anchor-point detector


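1. Feature pyramid: level l has resolution input-size * (1/sl), where the stride is sl = 2**l.
2. Head: two subnets, one for classification (K class probabilities per anchor point) and one for regression (4 values per anchor point).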

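3. Supervision targets: plij denotes the feature (anchor point) at coordinates (i, j) on level l, with i = 0,1,…,W/sl − 1 and j = 0,1,…,H/sl − 1. Each plij has a corresponding image space location (Xlij, Ylij), where Xlij = sl(i + 0.5) and Ylij = sl(j + 0.5).
Positive-sample assignment: for a ground-truth box B = (c, x, y, w, h), define the centrally shrunk box Bv = (c, x, y, εw, εh), where ε is the shrink factor. An anchor point is positive only if it lies inside Bv.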
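The regression targets of a positive anchor point are the normalized distances from its image location (Xlij, Ylij) to the left, top, right and bottom boundaries of B.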
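The supervision targets above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the defaults `shrink=0.2` (for ε) and `norm=4.0` (a normalization constant that multiplies the stride) are assumptions, since the notes do not state the values used.

```python
def anchor_location(level, i, j):
    """Image-space location (X, Y) of anchor point (i, j) on pyramid level `level`."""
    stride = 2 ** level                      # sl = 2**l
    return stride * (i + 0.5), stride * (j + 0.5)


def regression_target(level, i, j, box, shrink=0.2, norm=4.0):
    """Normalized (left, top, right, bottom) distances, or None for a negative point.

    box    -- ground-truth box as (x, y, w, h), with (x, y) the box centre
    shrink -- shrink factor epsilon of the inner box Bv (assumed value)
    norm   -- normalization constant applied together with the stride (assumed value)
    """
    x, y, w, h = box
    X, Y = anchor_location(level, i, j)
    # A point is positive only if it falls inside Bv = (x, y, shrink*w, shrink*h).
    if abs(X - x) >= shrink * w / 2 or abs(Y - y) >= shrink * h / 2:
        return None
    scale = norm * 2 ** level
    return (
        (X - (x - w / 2)) / scale,   # distance to the left boundary
        (Y - (y - h / 2)) / scale,   # distance to the top boundary
        ((x + w / 2) - X) / scale,   # distance to the right boundary
        ((y + h / 2) - Y) / scale,   # distance to the bottom boundary
    )
```

For example, on level 3 (stride 8) the point (6, 6) maps to image location (52, 52). For a 64x32 box centred there, it is positive with targets (1.0, 0.5, 1.0, 0.5), while the neighbouring point (7, 6) falls outside the shrunk box and is negative.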

4. Loss functions: focal loss for classification, IoU loss for regression.
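To make the two losses concrete, here is a minimal per-sample sketch in plain Python. It uses the standard binary focal loss and the `-log(IoU)` form of the IoU loss. The defaults `alpha=0.25` and `gamma=2.0` are the commonly used focal loss settings and are assumed here, not taken from these notes. Boxes are given as (left, top, right, bottom) distances from the same anchor point, which makes the intersection easy to compute.

```python
import math


def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted class probability p in (0, 1)."""
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # The (1 - p_t)**gamma factor down-weights easy, well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def iou_loss(pred, target):
    """-log(IoU) between two boxes given as positive (l, t, r, b) distances
    measured from the same anchor point."""
    pl, pt, pr, pb = pred
    tl, tt, tr, tb = target
    area_pred = (pl + pr) * (pt + pb)
    area_target = (tl + tr) * (tt + tb)
    # Both boxes contain the anchor point, so the overlap extends
    # min(distance) in each of the four directions.
    inter = (min(pl, tl) + min(pr, tr)) * (min(pt, tt) + min(pb, tb))
    union = area_pred + area_target - inter
    return -math.log(inter / union)
```

A perfect box prediction gives an IoU loss of 0, and a confident correct classification gives a much smaller focal loss than an uncertain one.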



Two novel strategies

1. SW (soft-weighted anchor points)

In the weight formula, η controls how steeply the weight decreases.
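The weight formula itself is not reproduced in these notes. The sketch below assumes it is a centerness-style function of the four point-to-boundary distances raised to the power η, which is how I recall the SAPD paper defining it, so treat the exact form as an assumption. The function name is hypothetical.

```python
def soft_weight(d_left, d_top, d_right, d_bottom, eta=1.0):
    """Centerness-style weight in (0, 1] for a positive anchor point.

    The weight is 1 at the box centre and shrinks as the point moves
    toward a boundary; a larger eta makes it shrink faster.
    Assumes all four distances are positive.
    """
    horizontal = min(d_left, d_right) / max(d_left, d_right)
    vertical = min(d_top, d_bottom) / max(d_top, d_bottom)
    return (horizontal * vertical) ** eta
```

With η = 1, a point at the exact centre gets weight 1, while a point three times closer to the left boundary than to the right gets weight 1/3. Raising η to 2 drops that to 1/9.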


2. SS (soft-selected pyramid levels)

The selection is instance-dependent: features of the instance from each pyramid level -> RoIAlign -> concat -> meta-selection network -> a vector

The meta-selection network is jointly trained with the detector. Cross entropy loss is used for optimization and the ground-truth is a one hot vector indicating which pyramid level has minimal loss as defined in the FSAF module.
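The training signal for the meta-selection network can be illustrated as follows. This is a minimal sketch in plain Python with hypothetical function names: `logits` stands for the raw output vector of the meta-selection network for one instance (one entry per pyramid level), and `level_losses` stands for that instance's detection loss on each level.

```python
import math


def softmax(logits):
    """Turn the meta-selection output vector into per-level probabilities."""
    shift = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def selection_target(level_losses):
    """One-hot vector marking the pyramid level with minimal loss."""
    best = level_losses.index(min(level_losses))
    return [1 if k == best else 0 for k in range(len(level_losses))]


def selection_loss(logits, level_losses):
    """Cross entropy between the predicted level probabilities and the one-hot target."""
    probs = softmax(logits)
    target = selection_target(level_losses)
    return -sum(t * math.log(p) for t, p in zip(target, probs))
```

If the per-level losses are `[0.9, 0.3, 0.7]`, the target is `[0, 1, 0]`, and the cross entropy is small only when the network puts most of its probability on the second level.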

Loss

———————————————————————————————————————————————

Ablation experiments

The results show that larger instances tend to be assigned higher weights at higher pyramid levels. Most instances can be learned with no more than two levels. Only very rare instances need more than two levels, for example the sofa in the top-right sub-figure of Figure 7. This is consistent with the results in Table 4.

Joint training of the meta-selection network has a negligible effect on performance.
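One might attribute the gains to multi-task training. To check this, the paper trains the meta-selection network jointly but does not use its predicted weights; in other words, the feature selection strategy is the same as in the baseline FSAF module.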
SAPD is robust and efficient
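Changing the backbone easily improves AP and AR.
Comparison with anchor-based methods shows that SAPD is not only fast (because its head structure is simple) but also more accurate, by significant margins, even against methods that combine anchor-based and anchor-free detection.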

Without DCN, our fastest SAPD version based on ResNet-50 can reach 14.9 FPS while maintaining a 41.7% AP.
With DCN, our SAPD forms an upper envelope of recent state-of-the-art anchor-based and anchor-free detectors.

———————————————————————————————————————————————

Supplement

FSAF
