Paper link: https://arxiv.org/abs/2004.00280
The idea of this paper is quite interesting: building on knowledge distillation, it uses the output of an unsupervised algorithm to guide a supervised algorithm.
In the paper, the unsupervised (teacher) algorithm is UGACH (Unsupervised Generative Adversarial Cross-Modal Hashing) and the supervised (student) algorithm is DCMH (Deep Cross-Modal Hashing). The teacher's output is the similarity matrix $S_{i,j}$. The model architecture is shown below.
*(Figure: overall model architecture)*
Unlike UGACH, which determines the similarity matrix with a k-nearest-neighbor scheme, this paper determines it from the Euclidean distance between feature vectors. Several variants are tried, as shown below:
*(Figure: candidate schemes for constructing the similarity matrix)*
- $v_i^I$ is the original image feature vector; $v_i^T$ is the original text feature vector.
- $f_i^I$ is the image vector produced by the neural network; $f_i^T$ is the text vector produced by the neural network.
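As a minimal sketch of a distance-based construction (not the paper's exact rule — the thresholding step and its cutoff value are assumptions here), one can threshold pairwise Euclidean distances between the two modalities' feature vectors to get a binary similarity matrix:

```python
import numpy as np

def similarity_matrix(feat_a, feat_b, threshold=0.5):
    """Build a binary similarity matrix from pairwise Euclidean distances.

    feat_a, feat_b: (n, d) and (m, d) feature matrices, e.g. image
    features f^I and text features f^T from the teacher network.
    threshold: hypothetical cutoff; pairs closer than it count as similar.
    """
    # Pairwise Euclidean distances via broadcasting -> shape (n, m)
    dist = np.linalg.norm(feat_a[:, None, :] - feat_b[None, :, :], axis=2)
    # S[i, j] = 1 if the (i, j) pair is deemed similar, else 0
    return (dist < threshold).astype(np.float32)

# Toy example: paired image/text features that are nearly identical
rng = np.random.default_rng(0)
f_img = rng.normal(size=(4, 8))
f_txt = f_img + 0.01 * rng.normal(size=(4, 8))
S = similarity_matrix(f_img, f_txt, threshold=0.5)
```

With these toy features only the matched pairs fall under the cutoff, so `S` comes out as the identity matrix; in practice the paper compares several such construction rules.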
Since the similarity matrix cannot be computed directly (the student has no labels), a new term must be introduced into the original objective function:
$$\theta^{I,\star},\,\theta^{T,\star} = \arg\min_{\theta^{I},\,\theta^{T}} \sum_{i,j} S_{i,j} \cdot \lVert f_i^I - f_j^T \rVert$$
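Treating $S$ as the fixed teacher output, this term sums the cross-modal distances of the pairs the teacher marks as similar, pulling them together. A NumPy sketch of the loss value (function name and toy inputs are illustrative, not from the paper):

```python
import numpy as np

def distillation_loss(f_img, f_txt, S):
    """Compute sum_{i,j} S[i, j] * ||f_i^I - f_j^T||.

    Pairs the teacher marks similar (S[i, j] = 1) contribute their
    Euclidean distance; dissimilar pairs contribute nothing.
    """
    dist = np.linalg.norm(f_img[:, None, :] - f_txt[None, :, :], axis=2)
    return float((S * dist).sum())

# Two toy 2-D embeddings per modality
f_img = np.array([[0.0, 0.0], [1.0, 0.0]])
f_txt = np.array([[0.0, 0.0], [1.0, 0.0]])

zero_loss = distillation_loss(f_img, f_txt, np.eye(2))   # matched pairs coincide
full_loss = distillation_loss(f_img, f_txt, np.ones((2, 2)))  # all pairs weighted
```

Here `zero_loss` is 0 because each matched pair coincides, while `full_loss` also counts the two cross pairs at distance 1 each, giving 2.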