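In an earlier comparison of several RoI pooling implementations, I found that pure Python and PyTorch's built-in utility functions were noticeably slow. So here I run a similar quick comparison for NMS, another speed bottleneck in Faster-RCNN.
I ran four experiments to get a rough idea of how different implementation approaches affect NMS speed.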
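Method 1: pure Python implementation: simple and convenient, but slow
Method 2: compile the Python code directly with Cython
Method 3: declare all variables with static types first, then compile with Cython
Method 4: on top of Method 3, add a CUDA acceleration module and compile with Cython, i.e. use the GPU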
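The benchmark code used for the comparison is not shown at this point, so the following is only a minimal sketch of how such a timing could be done. The names `nms_numpy`, `random_dets` and `time_nms`, the box count, and the repeat count are all assumptions made for illustration, not the author's setup.

```python
import time
import numpy as np

def nms_numpy(dets, thresh):
    """Greedy NMS on an (m, 5) array of [x1, y1, x2, y2, score] (illustrative stand-in)."""
    x1, y1, x2, y2, scores = dets.T
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        w = np.maximum(0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= thresh]         # drop boxes that overlap box i too much
    return keep

def random_dets(n, seed=0):
    """Generate n random boxes with scores (hypothetical test data)."""
    rng = np.random.default_rng(seed)
    xy = rng.uniform(0, 800, size=(n, 2))   # top-left corners
    wh = rng.uniform(20, 200, size=(n, 2))  # widths and heights
    scores = rng.uniform(0, 1, size=(n, 1))
    return np.hstack([xy, xy + wh, scores]).astype(np.float32)

def time_nms(fn, dets, thresh=0.7, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs of fn(dets, thresh)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(dets, thresh)
        best = min(best, time.perf_counter() - start)
    return best

dets = random_dets(2000)
elapsed = time_nms(nms_numpy, dets)
print(f"nms_numpy on {len(dets)} boxes: {elapsed * 1e3:.2f} ms")
```

Any function with the same `(dets, thresh)` signature, whether pure Python, Cython-compiled, or GPU-backed, can be passed to `time_nms` in place of `nms_numpy`. Taking the best of several runs reduces noise from one-off effects such as cache warm-up.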
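I. A few notes
1. A brief note on Cython:
Cython is a tool for quickly building Python extension modules. Syntactically it is a hybrid of Python and C. When Python code hits a performance bottleneck, Cython lets you bring C-level speed into the program without rewriting it in C, and it integrates easily with existing Python code, so both development efficiency and execution efficiency improve. The intermediate steps (translating the source to C and compiling it into an extension module) are handled by Cython.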
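2. A brief introduction to NMS:
Faster-RCNN uses NMS in two places. The first is during both training and prediction, when ProposalCreator generates proposals: only a subset of the proposals is needed, so NMS is used to filter them. The second is at prediction time: once the 300 classification and box-offset results are available, non-maximum suppression is applied to each class separately. One might ask why we don't simply take the highest-confidence box for each class. The reason is that an image may contain more than one instance of a class. For example, if an image contains several people, taking only the highest-confidence box detects just one of them, whereas with NMS, ideally, every person (every instance of every class) ends up with exactly one bbox.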
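To make the "several people in one image" case concrete, here is a small sketch comparing top-1 selection with NMS for a single class. The detections in `person_dets`, the helper name `greedy_nms`, and the threshold of 0.5 are made up for illustration.

```python
import numpy as np

def greedy_nms(dets, thresh):
    """Greedy NMS on an (m, 5) array of [x1, y1, x2, y2, score]; returns kept row indices."""
    x1, y1, x2, y2, scores = dets.T
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        w = np.maximum(0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= thresh]         # keep only boxes that do not overlap box i much
    return keep

# Hypothetical detections for the class "person": two people, two boxes each
person_dets = np.array([
    [ 50.,  60., 150., 300., 0.95],   # person A
    [ 55.,  65., 155., 305., 0.80],   # person A, near-duplicate
    [400.,  80., 500., 320., 0.90],   # person B
    [395.,  75., 505., 315., 0.60],   # person B, near-duplicate
])

top1 = [int(person_dets[:, 4].argmax())]    # highest score only: finds one person
kept = greedy_nms(person_dets, thresh=0.5)  # NMS: one box per person

print("top-1:", top1)   # [0]
print("NMS:  ", kept)   # [0, 2]
```

Top-1 selection returns only person A's best box, while NMS removes the two near-duplicates and keeps one box for each person.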
II. Implementing the four methods
1. Pure Python implementation: nms_py.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Mon May 7 21:45:37 2018

@author: lps
"""

import numpy as np
import matplotlib.pyplot as plt

# Each row is [x1, y1, x2, y2, score]
boxes = np.array([[100, 100, 210, 210, 0.72],
                  [250, 250, 420, 420, 0.8],
                  [220, 220, 320, 330, 0.92],
                  [100, 100, 210, 210, 0.72],
                  [230, 240, 325, 330, 0.81],
                  [220, 230, 315, 340, 0.9]])


def py_cpu_nms(dets, thresh):
    # dets: (m, 5) array, thresh: scalar IoU threshold
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]

    areas = (y2 - y1 + 1) * (x2 - x1 + 1)
    scores = dets[:, 4]
    keep = []

    index = scores.argsort()[::-1]  # box indices sorted by descending score

    while index.size > 0:
        i = index[0]  # the first entry always has the highest score, so keep it directly
        keep.append(i)

        # corners of the overlap between box i and each remaining box
        x11 = np.maximum(x1[i], x1[index[1:]])
        y11 = np.maximum(y1[i], y1[index[1:]])
        x22 = np.minimum(x2[i], x2[index[1:]])
        y22 = np.minimum(y2[i], y2[index[1:]])

        w = np.maximum(0, x22 - x11 + 1)  # width of the overlap
        h = np.maximum(0, y22 - y11 + 1)  # height of the overlap

        overlaps = w * h
        ious = overlaps / (areas[i] + areas[index[1:]] - overlaps)

        idx = np.where(ious <= thresh)[0]
        index = index[idx + 1]  # +1 because ious was computed against index[1:]

    return keep


def plot_bbox(dets, c='k'):
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]

    # draw the four edges of every box
    plt.plot([x1, x2], [y1, y1], c)
    plt.plot([x1, x1], [y1, y2], c)
    plt.plot([x1, x2], [y2, y2], c)
    plt.plot([x2, x2], [y1, y2], c)
    plt.title("after nms")


plot_bbox(boxes, 'k')  # before nms (black)

keep = py_cpu_nms(boxes, thresh=0.7)
plot_bbox(boxes[keep], 'r')  # after nms (red)
plt.show()
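To check by hand which of the sample boxes should survive, it can help to look at the full pairwise IoU matrix. The helper `pairwise_iou` below is not part of nms_py.py; it is an illustrative sketch that uses the same `+1` pixel convention as `py_cpu_nms`.

```python
import numpy as np

# Same sample boxes as in nms_py.py: [x1, y1, x2, y2, score]
boxes = np.array([[100, 100, 210, 210, 0.72],
                  [250, 250, 420, 420, 0.8],
                  [220, 220, 320, 330, 0.92],
                  [100, 100, 210, 210, 0.72],
                  [230, 240, 325, 330, 0.81],
                  [220, 230, 315, 340, 0.9]])

def pairwise_iou(dets):
    """Return the (m, m) matrix of IoU values between all pairs of boxes."""
    x1, y1, x2, y2 = dets[:, 0], dets[:, 1], dets[:, 2], dets[:, 3]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    # broadcast to compare every box (rows) against every box (columns)
    iw = np.maximum(0, np.minimum(x2[:, None], x2[None, :])
                       - np.maximum(x1[:, None], x1[None, :]) + 1)
    ih = np.maximum(0, np.minimum(y2[:, None], y2[None, :])
                       - np.maximum(y1[:, None], y1[None, :]) + 1)
    inter = iw * ih
    return inter / (areas[:, None] + areas[None, :] - inter)

iou = pairwise_iou(boxes)
print(np.round(iou, 3))
```

Box 2 has the highest score. Its IoU with box 5 is about 0.80 and with box 4 about 0.71, so at `thresh=0.7` both are suppressed. Its IoU with box 1 is about 0.17, so box 1 is kept. Boxes 0 and 3 are identical (IoU of 1), so only one of them survives. Three boxes remain after NMS.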