End-to-end comparative attention networks for person re-identification

A quick summary of the attention-related parts of this paper.

The overall network architecture is as follows:

(figure: overall network architecture)
The CNN part is a truncated pre-trained VGG Net: the last three fully connected layers are removed, yielding the feature map X.
H is the output vector of the network (how it is obtained is explained below).
Note that the network branches share parameters.
The network is trained end to end by optimizing the multi-task loss described below.
Multi-task loss:
Triplet loss:
(equation figure: triplet loss)
Classification softmax loss:
(equation figure: softmax classification loss)
Combined multi-task loss:
(equation figure: multi-task loss)
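The equation images above did not survive extraction. As a hedged sketch in my own notation (not necessarily the paper's exact formulation), a triplet loss over anchor/positive/negative embeddings, a softmax classification loss, and their weighted combination typically take the form:

```latex
% Triplet loss: anchor x^a, positive x^+, negative x^-, margin m,
% embedding f(\cdot), distance d(\cdot,\cdot)
L_{\mathrm{tri}} = \max\bigl(0,\; d(f(x^a), f(x^+)) - d(f(x^a), f(x^-)) + m\bigr)

% Softmax classification loss over identity labels y
L_{\mathrm{cls}} = -\sum_{i} y_i \log \hat{y}_i

% Multi-task combination with a balancing weight \lambda
L = L_{\mathrm{tri}} + \lambda\, L_{\mathrm{cls}}
```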

The attention component:

LSTM unit at a single time step:
(equation figure: LSTM unit update)
X_t denotes the feature map at time step t, but every time step uses the same feature map X, i.e. the feature map taken directly from the CNN.
l_(t-1) denotes the attention map generated from h_(t-1); W_(i,h) are weight parameters, learned end to end together with the LSTM parameters.
(equation figure: attention map generation)
Context vector:
(equation figure: context vector)
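Since this equation image is also lost, here is a hedged sketch of the usual soft-attention formulation (my notation, assuming N spatial regions X_i in the feature map): the attention map is a softmax over region scores computed from the previous hidden state, and the context vector is the attention-weighted sum of the regions:

```latex
% Attention map from the previous hidden state (W_{i,h} as in the text above)
l_t = \mathrm{softmax}\bigl(W_{i,h}\, h_{t-1}\bigr)

% Context vector: attention-weighted sum over the N regions of X
z_t = \sum_{i=1}^{N} l_{t,i}\, X_i
```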
The attention component as an LSTM network over multiple time steps:
(figure: attention component with multiple LSTM time steps)
(h_0 and c_0 are pre-initialized by a two-layer perceptron)
(equation figure: initialization of h_0 and c_0)
A concatenation layer selects m hidden states h_i and merges them.
(equation figure: concatenation of hidden states)
Because the full network is fairly complex and the loss fluctuates considerably, the concatenated representation R is finally L2-normalized.
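The pipeline described above (same feature map X at every step, attention map from the previous hidden state, context vector fed to the recurrent cell, concatenation of the hidden states, L2 normalization of R) can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: the weights are random rather than learned, and a plain tanh RNN cell stands in for the LSTM.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_branch(X, steps=3, hidden=8, seed=0):
    """Sketch of one attention branch over a CNN feature map X
    (shape: regions x channels). Every time step reuses the SAME X."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_att = rng.standard_normal((hidden, n)) * 0.1  # hidden -> region scores
    W_xh = rng.standard_normal((d, hidden)) * 0.1   # context -> hidden
    W_hh = rng.standard_normal((hidden, hidden)) * 0.1
    h = np.zeros(hidden)  # in the paper, h_0 and c_0 come from a 2-layer MLP
    hs = []
    for _ in range(steps):
        l = softmax(h @ W_att)            # attention map l_t over the n regions
        z = l @ X                         # context vector: weighted sum of X
        h = np.tanh(z @ W_xh + h @ W_hh)  # tanh RNN cell stands in for the LSTM
        hs.append(h)
    R = np.concatenate(hs)                # concatenate the m hidden states
    return R / np.linalg.norm(R)          # L2-normalize R

X = np.random.default_rng(1).standard_normal((10, 16))  # 10 regions, 16 channels
R = attention_branch(X)
print(R.shape)  # (24,) = steps * hidden
```

With shared weights, each branch of a triplet would simply call `attention_branch` on its own feature map, so the attended regions still depend on the input image even though the parameters are identical.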


Parts I do not yet understand:

1) Since the network branches share parameters, how can the three images attend to regions at different physical locations? (Judging by the figures in the paper this does not actually happen, so I am probably misunderstanding something.)


Thanks to: Liu H, Feng J, Qi M, et al. End-to-end comparative attention networks for person re-identification[J]. IEEE Transactions on Image Processing, 2017, 26(7): 3492-3506.
