「Computer Vision」Notes on Associative Embedding

Sina Weibo: 小锋子Shawn
Tencent E-mail: [email protected]
http://blog.csdn.net/dgyuanshaofeng/article/details/82075612

Associative Embedding [1] has been cited by a large number of papers at high-quality conferences.

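1 Introduction

The authors open by noting that many computer vision tasks are fundamentally about detection and grouping: "detecting smaller visual units and grouping them into larger structures". They give multi-person pose estimation, instance segmentation, and multi-object tracking as examples.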

2 Related Work

Top-down methods first detect individual people and then estimate the pose of each person. Examples: RMPE [2], Mask R-CNN [3], detector + estimator [4].
Bottom-up methods first detect individual body joints and then group those joints into people. Examples: PAF [5], DeepCut [6], DeeperCut [7], Local Joint-to-Person Associations [8].

3 Method

3.1 Network Architecture

A modified version of the stacked hourglass network [9] is used.

3.2 Detection and Grouping

An overview of the method is shown in Figure 1.

Figure 1 Overview

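The grouping loss, as defined in [1], is:

L_{g}(h, T) = \frac{1}{N} \sum_{n} \sum_{k} \left( \bar{h}_{n} - h_{k}(x_{nk}) \right)^{2} + \frac{1}{N^{2}} \sum_{n} \sum_{n'} \exp\left\{ -\frac{1}{2\sigma^{2}} \left( \bar{h}_{n} - \bar{h}_{n'} \right)^{2} \right\},

where h_{k}(x_{nk}) is the tag predicted at the ground-truth location of joint k of person n, \bar{h}_{n} = \frac{1}{K} \sum_{k} h_{k}(x_{nk}) is the mean tag (reference embedding) of person n, and N is the number of people.
The first term is the pull loss, which draws the tags of one person's joints toward that person's mean tag. The second term is the push loss, which drives the mean tags of different people apart.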
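
The pull and push terms can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes scalar tags and the form of the loss given in [1]: the squared distance from each joint's tag to its person's mean tag, plus a Gaussian penalty on the difference between the mean tags of every pair of people. The names `grouping_loss`, `tags`, and `sigma` are hypothetical and chosen only for this example.

```python
import math


def grouping_loss(tags, sigma=1.0):
    """Return (pull, push) for one image.

    tags[n][k] is the scalar tag predicted at the ground-truth location
    of joint k of person n (hypothetical input layout for this sketch).
    """
    n_people = len(tags)
    # Reference embedding of each person: the mean of that person's joint tags.
    means = [sum(person) / len(person) for person in tags]

    # Pull term: squared distance of every joint tag to its person's mean.
    pull = sum(
        (mean - tag) ** 2
        for person, mean in zip(tags, means)
        for tag in person
    ) / n_people

    # Push term: Gaussian penalty that is large when two mean tags are close.
    # The double sum includes n == n', which only adds a constant 1/N.
    push = sum(
        math.exp(-((m1 - m2) ** 2) / (2.0 * sigma ** 2))
        for m1 in means
        for m2 in means
    ) / n_people ** 2

    return pull, push


# Two people whose joints agree perfectly within each person.
pull, push = grouping_loss([[1.0, 1.0], [3.0, 3.0]])
print(pull, push)
```

In this example the pull term is zero because each person's tags are identical, and the push term shrinks as the two mean tags move further apart.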

3.3 Parsing the Network Output
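
As a rough illustration of how joint detections might be grouped by their tags, the sketch below assigns each detection to the person whose mean tag is closest, and starts a new person when no existing one is close enough. It is a simplified greedy stand-in, not the exact procedure of [1], which also takes the detection score into account when matching. The function `group_by_tags`, the `(x, y, score, tag)` tuple layout, and the `tag_threshold` value are all assumptions made for this example.

```python
def group_by_tags(detections, tag_threshold=1.0):
    """Greedily group joint detections into people by tag similarity.

    detections[k] is a list of (x, y, score, tag) candidates for joint
    type k (hypothetical layout). Returns a list of people, each a dict
    mapping joint index -> (x, y, score, tag).
    """
    people = []  # each entry: {"joints": {k: det}, "tags": [tag, ...]}
    for k, candidates in enumerate(detections):
        # Visit the strongest detections first so they get first pick.
        for det in sorted(candidates, key=lambda d: d[2], reverse=True):
            tag = det[3]
            best, best_dist = None, tag_threshold
            for person in people:
                if k in person["joints"]:
                    continue  # this person already has joint k
                mean_tag = sum(person["tags"]) / len(person["tags"])
                dist = abs(tag - mean_tag)
                if dist < best_dist:
                    best, best_dist = person, dist
            if best is None:
                # No existing person is close enough: start a new one.
                best = {"joints": {}, "tags": []}
                people.append(best)
            best["joints"][k] = det
            best["tags"].append(tag)
    return [person["joints"] for person in people]


# Two joint types, two people; tags (not pixel distance) decide the grouping.
detections = [
    [(10, 10, 0.9, 0.1), (50, 10, 0.8, 2.0)],
    [(12, 20, 0.7, 2.1), (52, 20, 0.95, 0.0)],
]
for person in group_by_tags(detections):
    print(person)
```

Here the joint at x = 52 joins the person that starts at x = 10 because their tags are close, even though the other candidate is spatially nearer.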

4 Experiments

The datasets are MS-COCO and MPII.

5 Conclusion

[1] Associative Embedding: End-to-End Learning for Joint Detection and Grouping. NIPS 2017
[2] RMPE: Regional Multi-person Pose Estimation. ICCV 2017
[3] Mask R-CNN. ICCV 2017
[4] Towards Accurate Multi-person Pose Estimation in the Wild. CVPR 2017
[5] Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields. CVPR 2017
[6] DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation. CVPR 2016
[7] DeeperCut: A Deeper, Stronger, and Faster Multi-person Pose Estimation Model. ECCV 2016
[8] Multi-person Pose Estimation with Local Joint-to-Person Associations. ECCV 2016
[9] Stacked Hourglass Networks for Human Pose Estimation. ECCV 2016