Spatio-Temporal Attention-Based LSTM Networks for 3D Action Recognition and Detection
Keywords: LSTM, spatio-temporal, attention model, action recognition & detection
Problem addressed: porting the attention model previously used on RGB video-based data to skeleton-based data
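Related work:
RGB-based methods: prone to losing 3D information
skeleton-based methods:
Kinect; prone to losing surface information
Linear combinations of actionlets; adding pairwise joint distances (a low degree of nonlinearity); adding sparsity constraints to the connection matrix; AdaBoost to select key frames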
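action recognition:
Spatial attention model: joint-selection gates ---- assign importance to the joints ---- applied at the input
Temporal attention model: frame-selection gates ---- assign attention to the frames ---- applied at the output
Both kinds of gates mimic the sigmoid-based gates inside an LSTM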
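A minimal sketch of how the two gates act, to make "applied at the input" versus "applied at the output" concrete. Assumptions not taken from these notes: the attention scores are passed in as plain numbers (in the model they would come from learned subnetworks), the spatial scores are normalized with a softmax, and the temporal scores go through a ReLU. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def spatial_gate(joints, joint_scores):
    """Input side: scale each joint's coordinates by its attention weight.

    joints: list of K coordinate lists for one frame.
    joint_scores: K raw scores (assumed to come from a learned subnetwork).
    """
    alpha = softmax(joint_scores)
    gated = [[a * c for c in joint] for a, joint in zip(alpha, joints)]
    return gated, alpha

def temporal_gate(frame_outputs, frame_scores):
    """Output side: weight each frame's output vector and sum over frames.

    frame_outputs: list of T output vectors from the main network.
    frame_scores: T raw scores; ReLU keeps the weights non-negative (assumption).
    """
    beta = [max(0.0, s) for s in frame_scores]
    dim = len(frame_outputs[0])
    pooled = [sum(b * out[d] for b, out in zip(beta, frame_outputs))
              for d in range(dim)]
    return pooled, beta
```

With a negative score, a frame contributes nothing to the pooled output, which is the sense in which the temporal gate "selects" frames.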
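Combining spatial and temporal features: 1 Three networks are combined: the spatial network, the temporal network, and the main network
2 The regularization terms in the loss function involve the two parameters of the spatio-temporal model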
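A rough sketch of a loss with one regularizer per attention type. The notes only say that the regularization involves the two spatio-temporal attention parameters; the specific penalty forms, the function name, and the default lambda values below are assumptions for illustration and should be checked against the paper.

```python
def regularized_loss(cross_entropy, alpha, beta,
                     lam_spatial=0.001, lam_temporal=0.0001):
    """Classification loss plus two attention penalties (assumed forms).

    alpha[t][k]: spatial attention weight of joint k at frame t.
    beta[t]:     temporal attention weight of frame t.
    """
    T = len(alpha)
    K = len(alpha[0])
    # Spatial penalty: discourage ignoring any joint over the whole sequence
    # by pulling each joint's time-averaged weight toward 1.
    spatial = sum(
        (1.0 - sum(alpha[t][k] for t in range(T)) / T) ** 2
        for k in range(K)
    )
    # Temporal penalty: keep the frame weights from growing without bound.
    temporal = sum(abs(b) for b in beta) / T
    return cross_entropy + lam_spatial * spatial + lam_temporal * temporal
```

Setting both lambdas to 0 recovers the plain classification loss, which makes it easy to check how much each penalty contributes.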
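action detection:
Unlike sliding-window approaches, action proposals are generated based on the temporal model; one of 9 candidates is selected
IoU (intersection over union)
Non-maximum suppression (NMS); candidate fusion
Denoising.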
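A generic sketch of temporal IoU and greedy NMS over candidate segments, to illustrate the two terms above. This is the textbook procedure, not necessarily the exact variant used in the paper; the (start, end, score) tuple layout and the 0.5 threshold are assumptions, and the candidate-fusion step is not shown.

```python
def temporal_iou(a, b):
    """IoU of two temporal segments given as (start, end)."""
    (s1, e1), (s2, e2) = a, b
    inter = max(0.0, min(e1, e2) - max(s1, s2))
    union = (e1 - s1) + (e2 - s2) - inter
    return inter / union if union > 0 else 0.0

def nms(proposals, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    proposals: list of (start, end, score).
    Keeps the highest-scoring proposal, drops any remaining proposal that
    overlaps a kept one by more than iou_threshold, and repeats.
    """
    ordered = sorted(proposals, key=lambda p: p[2], reverse=True)
    kept = []
    for p in ordered:
        if all(temporal_iou(p[:2], k[:2]) <= iou_threshold for k in kept):
            kept.append(p)
    return kept
```

For example, two candidates covering frames 0-10 and 1-11 overlap heavily, so only the higher-scoring one survives, while a candidate covering frames 20-30 is untouched.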
To read:
Real-time human pose recognition in parts from single depth images
Multi-region two-stream R-CNN for action detection
Neural machine translation by jointly learning to align and translate