A CVPR 2020 paper on self-attention.
[Paper Notes] CVPR 2020 Exploring Self-attention for Image Recognition

Contributions

  • explore variations of self-attention and assess their effectiveness for image recognition; two classes of self-attention are examined: pairwise self-attention & patchwise self-attention

  • Main conclusion: self-attention can serve as the basic building block of image recognition networks, with the patchwise form ahead in both accuracy and cost. (summary figure from the paper omitted)

Methods

  • Pairwise Self-attention
    (formula figure from the paper omitted)
    The weight applied to beta(x_j) is determined only by the pair (x_i, x_j).

    A position encoding can be added so that the network is aware of the spatial relation between x_i and x_j.
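The pairwise form above can be sketched as follows. This is a minimal single-location numpy sketch, not the paper's implementation: the function names, the linear maps `W_gamma`/`W_beta`, and the choice of subtraction as the pair relation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=0):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def pairwise_attention(x_i, neighbors, W_gamma, W_beta):
    # Pair relation delta(x_i, x_j): subtraction, one of the relation
    # options discussed in the paper.
    delta = x_i[None, :] - neighbors          # (k, c), one row per j
    # gamma maps each pair relation to per-channel weight logits, so the
    # weight on beta(x_j) depends only on the pair (x_i, x_j).
    logits = delta @ W_gamma                  # (k, c)
    alpha = softmax(logits, axis=0)           # normalize over footprint R(i)
    beta = neighbors @ W_beta                 # value transform beta(x_j)
    return (alpha * beta).sum(axis=0)         # aggregated output y_i, (c,)

c, k = 4, 5                                   # channels, footprint size
x_i = rng.standard_normal(c)
neighbors = rng.standard_normal((k, c))       # the x_j in R(i)
W_gamma = rng.standard_normal((c, c))
W_beta = rng.standard_normal((c, c))
y_i = pairwise_attention(x_i, neighbors, W_gamma, W_beta)
```

Note how the softmax is taken over the footprint axis, so the per-channel weights for all j locations sum to one channel-wise.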

  • Patchwise Self-attention
    (formula figure from the paper omitted)
    The weight applied to beta(x_j) is determined by the whole patch x_R(i) (all j locations in the footprint R(i)). This resembles convolution in that both can uniquely identify specific locations within their footprint, yet it is more expressive.
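The contrast with the pairwise form can be sketched as below: here the weight logits are computed from the entire patch, so every location in R(i) influences the weight at every other location. The concatenation of x_i with the flattened patch and the single linear map `W_alpha` are simplifying assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z, axis=0):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def patchwise_attention(x_i, patch, W_alpha, W_beta):
    k, c = patch.shape
    # The weight is a function of the whole patch x_R(i), not of
    # individual (x_i, x_j) pairs.
    feat = np.concatenate([x_i, patch.ravel()])   # (c + k*c,)
    logits = (feat @ W_alpha).reshape(k, c)       # vector weight per j
    alpha = softmax(logits, axis=0)               # normalize over R(i)
    return (alpha * (patch @ W_beta)).sum(axis=0) # y_i, (c,)

c, k = 4, 5
x_i = rng.standard_normal(c)
patch = rng.standard_normal((k, c))               # all x_j in R(i)
W_alpha = rng.standard_normal((c + k * c, k * c))
W_beta = rng.standard_normal((c, c))
y_i = patchwise_attention(x_i, patch, W_alpha, W_beta)
```

Because `W_alpha` sees the full patch, the operator can assign a distinct weight pattern to each position in the footprint, which is what makes it convolution-like but more general.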

  • Vector attention operators vs. standard scalar attention

    (formula figure from the paper omitted)

    The paper specifically highlights a shortcoming of conventional attention that computes a scalar weight via dot product (e.g., the Transformer): all channels share the same weight.
    The authors instead compute pairwise attention with summation, subtraction, Hadamard product, or concatenation to produce a vector output, so the final attention vector can assign a distinct weight to each channel.
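The scalar-vs-vector distinction can be made concrete with a small numpy sketch; the subtraction relation and the map `W` below stand in for the paper's relation and mapping functions and are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z, axis=0):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scalar_attention(q, keys, values):
    # Dot-product attention: one scalar weight per location j,
    # shared by every channel of values[j].
    w = softmax(keys @ q, axis=0)                 # (k,)
    return (w[:, None] * values).sum(axis=0)

def vector_attention(q, keys, values, W):
    # The relation (q - keys) is mapped to a full weight vector,
    # so each channel gets its own weight at each location j.
    w = softmax((q[None, :] - keys) @ W, axis=0)  # (k, c)
    return (w * values).sum(axis=0)

c, k = 4, 5
q = rng.standard_normal(c)
keys = rng.standard_normal((k, c))
values = rng.standard_normal((k, c))
W = rng.standard_normal((c, c))
y_scalar = scalar_attention(q, keys, values)
y_vector = vector_attention(q, keys, values, W)
```

In the scalar case the weight array is shape `(k,)`, one number per location; in the vector case it is `(k, c)`, one number per location per channel.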

Results

Patchwise attention comes out ahead in both accuracy and cost.
(results table from the paper omitted)
The paper also includes many other experiments, e.g., on how the transformation functions, the non-linearity, and the spatial/position encoding affect the results.
