2019_NIPS_Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation

https://paperswithcode.com/sota/synthetic-to-real-translation-on-gtav-to

The method reports state-of-the-art results on the GTA5→Cityscapes and SYNTHIA→Cityscapes scenarios.

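Abstract

Although progress has been made in matching the marginal distributions of the two domains, the classifier still favors source-domain features and makes wrong predictions on the target domain, because the alignment is category-agnostic.

The authors propose a novel category anchor-guided (CAG) unsupervised domain adaptation model for semantic segmentation. It explicitly enforces category-aware feature alignment so that shared discriminative features and classifiers are learned simultaneously.

First, the category-wise centroids of the source-domain features serve as guiding anchors, which are used to identify the active features in the target domain and to assign them pseudo-labels.

Then, an anchor-based pixel-level distance loss and a discriminative loss pull intra-category features closer together and push inter-category features further apart.

Finally, the authors devise a stagewise training mechanism to reduce error accumulation and adapt the proposed model progressively.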

Terminology:

domain discrepancy, domain-invariant representation, domain-invariant feature

Introduction

Previous work learns domain-invariant representations by matching the distributions of the source and target domains at the appearance level, the feature level, or the output level.

appearance level: [27, 34, 40, 13, 21]

feature level: [14, 27, 3, 13]

output level: [45, 36, 26]

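However, even though matching the global marginal distributions can bring the two domains closer (for example, by achieving a lower maximum mean discrepancy, or by reaching a saddle point through adversarial learning), these methods cannot guarantee that samples from different categories are properly separated, which hurts generalization. To address this, one can perform category-aware feature alignment by matching the local joint distributions of features and categories [7, 19, 33]. Other methods adopt the idea of self-training: they generate pseudo-labels for target-domain samples and thereby give the classifier extra supervision [47, 21, 3]. Together with the supervision from the source domain, this forces the network to learn domain-invariant, discriminative feature representations and a shared decision boundary simultaneously through back-propagation. Minimizing the entropy of the output [39], or minimizing the discrepancy between the outputs of two classifiers [26], implicitly enforces category-level alignment.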
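The point that matching global marginal distributions does not guarantee category separation can be illustrated with a toy example. The sketch below is not from the paper: it assumes a linear-kernel maximum mean discrepancy (which reduces to the squared distance between feature means), hand-made 1-D features, and a hypothetical threshold classifier `predict`. The two domains have identical means, so the discrepancy is zero, yet the categories are swapped in the target.

```python
import numpy as np

def linear_mmd2(source, target):
    """Squared MMD with a linear kernel: squared distance between feature means."""
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(diff @ diff)

def predict(feats):
    """Hypothetical classifier fitted to the source: category 1 if the feature is positive."""
    return (feats[:, 0] > 0).astype(int)

# Toy 1-D features (made up for illustration): two categories per domain,
# with the category positions swapped in the target domain.
src_feat = np.array([[-1.0], [-1.0], [1.0], [1.0]])
src_label = np.array([0, 0, 1, 1])
tgt_feat = np.array([[1.0], [1.0], [-1.0], [-1.0]])
tgt_label = np.array([0, 0, 1, 1])

mmd2 = linear_mmd2(src_feat, tgt_feat)                        # marginals match
src_acc = float((predict(src_feat) == src_label).mean())      # perfect on source
tgt_acc = float((predict(tgt_feat) == tgt_label).mean())      # fails on target

print(f"MMD^2 = {mmd2}, source acc = {src_acc}, target acc = {tgt_acc}")
```

Here the marginal discrepancy is zero while every target sample is misclassified, which is the failure mode that category-aware alignment is meant to avoid.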

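Although category-level alignment and self-training methods have achieved promising results, open problems remain before adaptation performance can improve further. For example, error-prone pseudo-labels mislead the classifier and cause errors to accumulate. Meanwhile, implicit category-level alignment may be affected by class imbalance. To handle these issues and combine the strengths of both kinds of methods, the authors propose a novel category anchor idea that facilitates both category-wise feature alignment and self-training. Motivated by the observation that features of the same category tend to cluster together, they use the centroids of the source-domain features of each category as explicit anchors to guide the adaptation.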

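The proposed model explicitly enforces category-wise feature alignment, and thereby learns shared feature representations and classifiers for both domains simultaneously.

First, the category-wise centroids of the source-domain features are used as anchors to identify the active features in the target domain.

Then, the authors assign pseudo-labels to the active features according to the category of the closest anchor.

Finally, the authors propose two loss functions:

The first is a pixel-level distance loss between the guiding anchors and the active features. It pulls the active features closer to their guiding anchors and explicitly minimizes the intra-category feature variance.

The second is a pixel-level discriminative loss, which supervises the classifier and maximizes the inter-category feature variance.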
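The steps above (anchors, active features, pseudo-labels, and the two losses) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: these notes do not give the exact criterion for "active" features or the exact loss forms, so the sketch assumes a simple distance threshold `max_dist`, a mean squared Euclidean distance for the distance loss, and a cross-entropy on pseudo-labels for the discriminative loss. All function names and the toy data are hypothetical; each row stands for the feature vector of one pixel.

```python
import numpy as np

def category_anchors(feats, labels, num_classes):
    """Per-category centroids of source features, shape (num_classes, dim)."""
    return np.stack([feats[labels == c].mean(axis=0) for c in range(num_classes)])

def assign_pseudo_labels(tgt_feats, anchors, max_dist):
    """Label each target feature by its nearest anchor.
    Assumed criterion: a feature is 'active' if it lies within max_dist of that anchor."""
    dists = np.linalg.norm(tgt_feats[:, None, :] - anchors[None, :, :], axis=2)
    return dists.argmin(axis=1), dists.min(axis=1) < max_dist

def distance_loss(tgt_feats, anchors, pseudo, active):
    """Assumed form: mean squared distance between active features and their anchors."""
    diff = tgt_feats[active] - anchors[pseudo[active]]
    return float((diff ** 2).sum(axis=1).mean())

def discriminative_loss(logits, pseudo, active):
    """Assumed form: cross-entropy of classifier logits against pseudo-labels (active only)."""
    z = logits[active]
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(z)), pseudo[active]].mean())

# Toy data, made up for illustration.
src_feats = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
src_labels = np.array([0, 0, 1, 1])
tgt_feats = np.array([[0.0, 0.0], [4.0, 2.0], [2.5, 1.0]])
tgt_logits = np.array([[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]])

anchors = category_anchors(src_feats, src_labels, num_classes=2)
pseudo, active = assign_pseudo_labels(tgt_feats, anchors, max_dist=1.2)
l_dist = distance_loss(tgt_feats, anchors, pseudo, active)
l_dis = discriminative_loss(tgt_logits, pseudo, active)
print(anchors, pseudo, active, l_dist, l_dis)
```

In this toy run the third target feature lies far from both anchors, so it is left inactive and contributes to neither loss; only features near an anchor receive a pseudo-label that is trusted for training.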

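To reduce the error accumulation caused by incorrect pseudo-labels, the authors propose a stagewise training mechanism that adapts the model progressively.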

Method abbreviation: CAG-UDA

References:

FL-[3] C. Chen, W. Xie, T. Xu, W. Huang, Y. Rong, X. Ding, Y. Huang, and J. Huang. Progressive feature alignment for unsupervised domain adaptation. arXiv preprint arXiv:1811.08585, 2018.

FL-[13] J. Hoffman, E. Tzeng, T. Park, J.-Y. Zhu, P. Isola, K. Saenko, A. Efros, and T. Darrell. Cycada: Cycle-consistent adversarial domain adaptation. In International Conference on Machine Learning (ICML), 2018.

FL-[14] J. Hoffman, D. Wang, F. Yu, and T. Darrell. Fcns in the wild: Pixel-level adversarial and constraint-based adaptation. arXiv preprint arXiv:1612.02649, 2016.

[21] Y. Li, L. Yuan, and N. Vasconcelos. Bidirectional learning for domain adaptation of semantic segmentation. arXiv preprint arXiv:1904.10620, 2019.

[26] Y. Luo, L. Zheng, T. Guan, J. Yu, and Y. Yang. Taking a closer look at domain shift: Category-level adversaries for semantics consistent domain adaptation. arXiv preprint arXiv:1809.09478, 2018.

[27] Z. Murez, S. Kolouri, D. Kriegman, R. Ramamoorthi, and K. Kim. Image to image translation for domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR),

[34] S. Sankaranarayanan, Y. Balaji, A. Jain, S. Nam Lim, and R. Chellappa. Learning from synthetic data: Addressing domain shift for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3752–3761, 2018.

[36] Y.-H. Tsai, W.-C. Hung, S. Schulter, K. Sohn, M.-H. Yang, and M. Chandraker. Learning to adapt structured output space for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 7472–7481, 2018.

[39] T.-H. Vu, H. Jain, M. Bucher, M. Cord, and P. Pérez. Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation. arXiv preprint arXiv:1811.12833, 2018.

[40] Z. Wu, X. Han, Y.-L. Lin, M. Gokhan Uzunbas, T. Goldstein, S. Nam Lim, and L. S. Davis. Dcan: Dual channel-wise alignment networks for unsupervised scene adaptation. In Proceedings of the European Conference on Computer Vision (ECCV), pages 518–534, 2018.

[45] Y. Zhang, P. David, and B. Gong. Curriculum domain adaptation for semantic segmentation of urban scenes. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 2020–2030, 2017.