Bags of Local Convolutional Features for Scalable Instance Search: Paper Notes


Mohedano E, Mcguinness K, O’Connor N E, et al. Bags of Local Convolutional Features for Scalable Instance Search[C]// ACM on International Conference on Multimedia Retrieval. ACM, 2016:327-331.




1 Motivation

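  • 1 Current TRECVid instance search systems still rely on aggregated local features such as SIFT, mainly because the resulting high-dimensional features are sparse and therefore tend to be linearly separable.
  • 2 Convolutional neural networks are currently very popular and have achieved good results in image retrieval.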

2 Contribution

  • 1 Sparse visual representation based on a Bags of Convolutional Features, which allows fast retrieval by means of an inverted index

  • 2 Assignment map as a new compact representation of the image

  • 3 Local analysis of multiple image regions for reranking followed by query expansion using the obtained object locations
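The inverted index named in the first contribution can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the sparse-dict BoW format, and the plain dot-product score are assumptions made for illustration only.

```python
from collections import defaultdict

def build_inverted_index(database):
    """database: dict image_id -> sparse BoW dict {word_id: count} (assumed format)."""
    index = defaultdict(list)
    for image_id, bow in database.items():
        for word, count in bow.items():
            if count:
                # Each visual word points to the images that contain it.
                index[word].append((image_id, count))
    return index

def score_candidates(query_bow, index):
    """Accumulate dot products, visiting only images that share a word with the query."""
    scores = defaultdict(float)
    for word, q_count in query_bow.items():
        for image_id, count in index.get(word, []):
            scores[image_id] += q_count * count
    return dict(scores)
```

Because the BoW vectors are sparse, a query only touches the posting lists of its own visual words, so images that share no word with the query are never visited.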


3 Pipeline


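Image feature description:

An input image is passed through a given CNN to obtain the feature map of one convolutional layer. The local CNN features are extracted from this feature map and clustered with K-means to produce the cluster centers. Each local CNN feature is then mapped to its cluster center, which yields the assignment map. Finally, following the BoW idea, the frequency of each cluster center is counted to obtain the final image descriptor.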
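The steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' code: the feature map is a small H x W grid of D-dimensional tuples, the K-means is a basic Lloyd iteration with random initialization, and all function names are hypothetical.

```python
import random
from collections import Counter

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(feature, centers):
    """Index of the cluster center closest to a local feature."""
    return min(range(len(centers)), key=lambda j: sq_dist(feature, centers[j]))

def kmeans(features, k, iters=10, seed=0):
    """Basic Lloyd's K-means; returns k cluster centers (the visual words)."""
    rng = random.Random(seed)
    centers = [list(f) for f in rng.sample(features, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in features:
            groups[nearest(f, centers)].append(f)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def assignment_map(feature_map, centers):
    """Replace every local feature of an H x W grid by its visual-word id."""
    return [[nearest(f, centers) for f in row] for row in feature_map]

def bow_histogram(amap, k):
    """Count how often each visual word occurs in the assignment map."""
    counts = Counter(word for row in amap for word in row)
    return [counts.get(j, 0) for j in range(k)]
```

In practice the codebook would be learned once from the local features of many images, and every database image would then be stored as its assignment map and BoW histogram.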


4 Instance retrieval

4.1 Initial Search

  • 1 Global search (GS): The BoW vector of the query is built with the visual words of all the local CNN features in the convolutional layer extracted for the query image.

  • 2 Local search (LS): The BoW vector of the query contains only the visual words of the local CNN features that fall inside the query bounding box.

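Note:
    GS means that the BoW feature of the whole query image is used as the query.
    LS means that only the BoW feature of the bounding box inside the query image is used as the query.

**In the initial search, the BoW feature of the query is matched against the database (whose images are also represented by the BoW features from the pipeline). This is a coarse retrieval step.**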
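The difference between GS and LS, followed by the coarse ranking, can be sketched as follows. The assumptions are: the query bounding box has already been converted to assignment-map coordinates, cosine similarity is the ranking score, and the database is a plain dict. All names are hypothetical.

```python
import math
from collections import Counter

def bow_from_region(amap, k, box=None):
    """BoW of an assignment map. box = (top, left, bottom, right), half-open,
    in assignment-map coordinates; box=None uses the whole map (GS)."""
    h, w = len(amap), len(amap[0])
    top, left, bottom, right = box if box is not None else (0, 0, h, w)
    counts = Counter(amap[i][j] for i in range(top, bottom)
                     for j in range(left, right))
    return [counts.get(word, 0) for word in range(k)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def initial_search(query_bow, database):
    """database: dict image_id -> BoW vector. Returns ids, best match first."""
    return sorted(database, key=lambda i: cosine(query_bow, database[i]),
                  reverse=True)
```

Calling `bow_from_region(amap, k)` gives the GS query, while passing the bounding box gives the LS query. Either vector can be handed to `initial_search`.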

4.2 Local reranking

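The paper partitions each image into candidate windows using all combinations of the widths {W, W/2, W/4} and the heights {H, H/2, H/4}. The resulting windows are then filtered with the following score:

$$score_w = \frac{\min(AR_w, AR_q)}{\max(AR_w, AR_q)}$$

where $AR_q = W_q / H_q$ and $AR_w = W_w / H_w$ are the aspect ratios of the query and of the window. A window is kept when its score exceeds a given threshold.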
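A small sketch of this filtering step is given below. It only enumerates window sizes (the sliding positions are omitted), it uses integer division for W/2 and W/4, and the threshold in the usage note is an arbitrary illustrative value. None of these choices are taken from the paper.

```python
def aspect_ratio_score(window_wh, query_wh):
    """score_w = min(AR_w, AR_q) / max(AR_w, AR_q), with AR = width / height."""
    ar_w = window_wh[0] / window_wh[1]
    ar_q = query_wh[0] / query_wh[1]
    return min(ar_w, ar_q) / max(ar_w, ar_q)

def candidate_window_sizes(W, H):
    """All (width, height) combinations of {W, W/2, W/4} x {H, H/2, H/4}."""
    widths = [W, W // 2, W // 4]
    heights = [H, H // 2, H // 4]
    return [(w, h) for w in widths for h in heights if w > 0 and h > 0]

def filter_windows(sizes, query_wh, threshold):
    """Keep only the window sizes whose score is above the threshold."""
    return [s for s in sizes if aspect_ratio_score(s, query_wh) > threshold]
```

For example, `filter_windows(candidate_window_sizes(16, 8), (4, 4), 0.6)` keeps only the square sizes, because the query in this toy case is square.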

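In addition, the paper borrows from spatial pyramid matching to subdivide each retained window. It uses L=2 resolution levels, i.e. the whole window and its 4 sub-regions. A BoW feature is computed for each sub-window, and the BoW features of different sub-windows are given different weights. The weight function is taken directly from the spatial pyramid matching paper:

$$w_r = \frac{1}{2^{(L - l_r)}}$$

where $w_r$ is the weight, $L = 2$, and $l_r$ is the resolution level of the current sub-window.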

For more on spatial pyramid matching, see: http://blog.csdn.net/chlele0105/article/details/16972695
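The weighting can be made concrete with a short sketch. Two things here are assumptions rather than statements from the paper: the levels are numbered 1 (whole window) and 2 (the four quadrants), and the weighted histograms are concatenated as in standard spatial pyramid matching. The function names are hypothetical.

```python
from collections import Counter

def histogram(words, k):
    counts = Counter(words)
    return [counts.get(word, 0) for word in range(k)]

def pyramid_weight(level, total_levels=2):
    """w_r = 1 / 2^(L - l_r)."""
    return 1.0 / 2 ** (total_levels - level)

def pyramid_bow(window, k):
    """window: grid (list of rows) of visual-word ids for one candidate window."""
    h, w = len(window), len(window[0])
    mh, mw = h // 2, w // 2
    whole = [word for row in window for word in row]
    quadrants = [
        [window[i][j] for i in rows for j in cols]
        for rows in (range(0, mh), range(mh, h))
        for cols in (range(0, mw), range(mw, w))
    ]
    # Level 1 (assumed numbering): the whole window.
    descriptor = [pyramid_weight(1) * c for c in histogram(whole, k)]
    # Level 2 (assumed numbering): the four sub-regions.
    for quad in quadrants:
        descriptor += [pyramid_weight(2) * c for c in histogram(quad, k)]
    return descriptor
```

Under this numbering the whole-window histogram gets weight 1/2 and each quadrant histogram gets weight 1, so matches at the finer level count more.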

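Once the BoW feature of each window is available, its cosine similarity with the query feature is computed, and the highest-scoring window is taken as the final location of the target.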
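Selecting the best window can be sketched as below. The window keys and the dict layout are hypothetical, and the BoW vectors are assumed to have been computed already for every retained window.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def best_window(query_bow, window_bows):
    """window_bows: dict mapping a window, e.g. (top, left, bottom, right),
    to its BoW vector. Returns (window, score) for the most similar window."""
    scored = ((win, cosine(query_bow, bow)) for win, bow in window_bows.items())
    return max(scored, key=lambda pair: pair[1])
```

The returned window serves as the object location for that image.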

4.3 Query expansion

  • 1 Global query expansion (GQE)
    The BoW vectors of the N images at the top of the ranking are averaged together with the BoW of the query to form the new representation for the query.


GQE averages the global features of the top five images returned by local reranking together with the query feature to produce a new query feature.

  • 2 Local query expansion (LQE)
    Locations obtained in the local reranking step are used to mask out the background and build the BoW descriptor of only the region of interest of the N-top images in the ranking.

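LQE averages the local features of the regions localized in the top five images returned by local reranking (for example, the BoW features inside the red boxes that mark the detected objects in the paper's figure) together with the query feature to produce a new query feature.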
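Both expansion variants reduce to averaging BoW vectors and differ only in which vectors are averaged. The sketch below assumes plain element-wise averaging and half-open boxes in assignment-map coordinates. The function names and data layout are hypothetical.

```python
from collections import Counter

def average_bows(vectors):
    """Element-wise mean of a list of equal-length BoW vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def global_query_expansion(query_bow, top_global_bows):
    """GQE: average the query with the full-image BoW of the top-N images."""
    return average_bows([query_bow] + list(top_global_bows))

def region_bow(amap, box, k):
    """BoW of the words inside box = (top, left, bottom, right), half-open."""
    top, left, bottom, right = box
    counts = Counter(amap[i][j] for i in range(top, bottom)
                     for j in range(left, right))
    return [counts.get(word, 0) for word in range(k)]

def local_query_expansion(query_bow, top_results, k):
    """LQE: top_results is a list of (assignment_map, located_box) pairs;
    everything outside the located box is ignored (masked out)."""
    local_bows = [region_bow(amap, box, k) for amap, box in top_results]
    return average_bows([query_bow] + local_bows)
```

With N = 5, `top_global_bows` or `top_results` would hold the five best images after local reranking, and the returned vector replaces the original query for a second search.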


5 Experiments


[Image: experimental results. R denotes local reranking.]

[Image: comparison with the state of the art.]
