Photoshop中磁力套索的一种简陋实现(基于Python)

经常用Photoshop的人应该熟悉磁力套索(Magnetic Lasso)这个功能，就是人为引导下的抠图辅助工具。在研发领域一般不这么叫，通常管这种边缘提取的办法叫Intelligent Scissors或者Livewire。

本来是给一个图像分割项目算法评估时的Python框架，觉得有点意思，就稍稍拓展了一下，用PyQt加了个壳，在非常简陋的程度上模拟了一下的磁力套索功能。为什么简陋：1) 只实现了最基本的边缘查找。路径冷却，动态训练，鼠标位置修正都没有，更别提曲线闭合，抠图，Alpha Matting等等；2) 没考虑性能规范，只为了书写方便；3) 我对Qt了解很浅，至今不会写Signal-Slot，不知道GUI写得是否合理；4) 没调试。

基本算法

相关算法我并没有做很深入的调研，不过相信这类应用中影响力最大的算法是来源于[1]，也是本文的主要参考，基本思想是把图片看成是一个无向图，相邻像素之间就可以计算出一个局部cost，于是就转化成了最短路径问题了，接下来就是基于Dijkstra算法产生路径，就是需要提取的边缘。主要涉及的算法有两部分：1) 相邻像素的cost计算；2) 最短路径算法。

边缘检测

计算相邻像素cost的最终目的还是为了寻找边缘，所以本质还是边缘检测。基本思想是，通过各种不同手段检测边缘，并且根据检测到的强度来求加权值，作为cost。从最短路径的角度来说，就是边缘越明显的地方，cost的值越小。[1]中的建议是用三种指标求加权：1) 边缘检测算子；2) 梯度强度(Gradient Magnitude)；3) 梯度方向(Gradient Direction)。本文的方法和[1]有那么一些不一样，因为懒，用了OpenCV中的Canny算子检测边缘而不是Laplacian Zero-Crossing Operator。表达式如下：

\[l\left( p,q \right)={{w}_{E}}{{f}_{E}}\left( q \right)+{{w}_{G}}{{f}_{G}}\left( q \right)+{{w}_{D}}{{f}_{D}}\left( p,q \right)\]

Canny算子

基本思想是根据梯度信息，先检测出许多连通的像素，然后对于每一坨连通的像素只取其中最大值且连通的部分，将周围置零，得到初始的边缘(Edges)，这个过程叫做Non-Maximum Suppression。然后用二阈值的办法将这些检测到的初始边缘分为Strong, Weak, and None三个等级，顾名思义，Strong就是很确定一定是边缘了，None就被舍弃，然后从Weak中挑选和Strong连通的作为保留的边缘，得到最后的结果，这个过程叫做Hysteresis Thresholding。这个算法太经典，更多细节一Google出来一大堆，我就不赘述了。公式如下：

\[{{f}_{E}}\left( q \right)=\left\{ \begin{matrix}
0;\text{ if }q\text{ is on a edge} \\
1;\text{ if }q\text{ is not on a edge} \\
\end{matrix} \right.\]

其实从权值的计算上和最大梯度有些重复，因为如果沿着最大梯度方向找出来的路径基本上就是边缘，这一项的作用我的理解主要应该是1) 避免梯度都很大的区域出现离明显边缘的偏离；2) 保证提取边缘的连续性，一定程度上来讲也是保证平滑。

梯度强度

就是梯度求模而已，x和y两个方向的梯度值平方相加在开方，公式如下：

\[{{I}_{G}}\left( q \right)=\sqrt{{{I}_{x}}\left( q \right)+{{I}_{y}}\left( q \right)}\]

因为要求cost，所以反向并归一化：

\[{{f}_{G}}\left( q \right)=1-\frac{{{I}_{G}}\left( q \right)}{\max \left( {{I}_{G}} \right)}\]

梯度方向

这一项其实是个平滑项，会给变化剧烈的边缘赋一个比较高的cost，让提取的边缘避免噪声的影响。具体公式如下：

\[{{f}_{D}}\left( p,q \right)=\frac{2}{3\pi }\left( \arccos \left( {{d}_{p}}\left( p,q \right) \right)+\arccos \left( {{d}_{q}}\left( p,q \right) \right) \right)\]

其中，

\[{{d}_{p}}\left( p,q \right)=\left\langle {{d}_{\bot }}\left( p \right),{{l}_{D}}\left( p,q \right) \right\rangle \]

\[{{d}_{q}}\left( p,q \right)=\left\langle {{l}_{D}}\left( p,q \right),{{d}_{\bot }}\left( q \right) \right\rangle \]

\[{{l}_{D}}\left( p,q \right)=\left\{ \begin{matrix}
q-p;\text{ if }\left\langle {{d}_{\bot }}\left( p \right),q-p \right\rangle \ge 0 \\
p-q;\text{ if }\left\langle {{d}_{\bot }}\left( p \right),q-p \right\rangle <0 \\
\end{matrix} \right.\]

\({{d}_{\bot }}\left( p \right)\)是取p的垂直方向，另外注意上式中符号的判断会将\({{d}_{\bot }}\left( p \right)\)和\({{l}_{D}}\left( p,q \right)\)的取值限制在π/2以内。

\[{{d}_{\bot }}\left( p \right)=\left( {{p}_{y}},-{{p}_{x}} \right)\]

斜对角方向的cost修正

在二维图像中，相邻的像素通常按照间隔欧式距离分为两种：1) 上下左右相邻，间隔为像素边长；2) 斜对角相邻，间隔为像素边长的\(\sqrt{2}\)倍。在计算局部cost的时候通常要把这种距离差异的影响考虑进去，比如下面这幅图：

2	3	4
5	6	6
7	8	9

如果不考虑像素位置的影响，那么查找最小cost的时候会认为左上角的cost=2最小。然而如果考虑到像素间距的影响，我们来看左上角方向，和中心的差异是6-2，做个线性插值的话，则左上角距中心单位距离上的值应该为\(6-\left( 6-2 \right)\times 1/\sqrt{2}\ =3.17>3\)，所以正上方的才是最小cost的正确方向。

最短路径查找

在磁力套索中，一般的用法是先单击一个点，然后移动鼠标，在鼠标和一开始单击的点之间就会出现自动贴近边缘的线，这里我们定义一开始单击的像素点为种子点(seed)，而磁力套索其实在考虑上部分提到的边缘相关cost的情况下查找种子点到当前鼠标的最短路径。如下图，红色的就是种子点，而移动鼠标时，最贴近边缘的种子点和鼠标坐标的连线就会实时显示，这也是为什么磁力套索也叫Livewire。

Photoshop中磁力套索的一种简陋实现(基于Python)

实现最短路径的办法很多，一般而言就是动态规划了，这里介绍的是基于Dijkstra算法的一种实现，基本思想是，给定种子点后，执行Dijkstra算法将图像的所有像素遍历，得到每个像素到种子点的最短路径。以下面这幅图为例，在一个cost矩阵中，利用Dijkstra算法遍历每一个元素后，每个元素都会指向一个相邻的元素，这样任意一个像素都能找到一条到seed的路径，比如右上角的42和39对应的像素，沿着箭头到了0。

Photoshop中磁力套索的一种简陋实现(基于Python)

算法如下：

输入:
s              // 种子点
l(q,r)         // 计算局部cost

数据结构:
L             // 当前待处理的像素
N(q)          // 当前像素相邻的像素
e(q)          // 标记一个像素是否已经做过相邻像素展开的Bool函数
g(q)          // 从s到q的总cost

输出:
p             // 记录所有路径的map

算法:
g(s)←0; L←s; // 将种子点作为第一点初始化
while L≠Ø:                 // 遍历尚未结束
    q←min(L);                // 取出最小cost的像素并从待处理像素中移除
    e(q)←TRUE;                 // 将当前像素记录为已经做过相邻像素展开
    for each r∈N(q) and not e(r):
      g_temp←g(q)+l(q,r);     // 计算相邻像素的总cost
      if r∈L and g_temp<g(r):    // 找到了更好的路径
        r←L; { from list.}     // 舍弃较大cost的路径
      else if r∉L:
        g(r)←g_temp;             // 记录当前找到的最小路径
        p(r)←q;
        L←r;                    // 加入待处理以试图寻找更短的路径

遍历的过程会优先经过cost最低的区域，如下图：

Photoshop中磁力套索的一种简陋实现(基于Python)

所有像素对应的到种子像素的最短路径都找到后，移动鼠标时就直接画出到seed的最短路径就可以了。

Python实现

算法部分直接调用了OpenCV的Canny函数和Sobel函数(求梯度)，对于RGB的处理也很简陋，直接用梯度最大的值来近似。另外因为懒，cost map和path map都直接用了字典(dict)，而记录展开过的像素则直接采用了集合(set)。GUI部分因为不会用QThread所以用了Python的threading，只有图像显示交互区域和状态栏提示，左键点击设置种子点，右键结束，已经提取的边缘为绿色线，正在提取的为蓝色线。

Photoshop中磁力套索的一种简陋实现(基于Python)

代码

算法部分

  1 from __future__ import division
  2 import cv2
  3 import numpy as np
  4 
  5 SQRT_0_5 = 0.70710678118654757
  6 
  7 class Livewire():
  8     """
  9     A simple livewire implementation for verification using 
 10         1. Canny edge detector + gradient magnitude + gradient direction
 11         2. Dijkstra algorithm
 12     """
 13     
 14     def __init__(self, image):
 15         self.image = image
 16         self.x_lim = image.shape[0]
 17         self.y_lim = image.shape[1]
 18         # The values in cost matrix ranges from 0~1
 19         self.cost_edges = 1 - cv2.Canny(image, 85, 170)/255.0
 20         self.grad_x, self.grad_y, self.grad_mag = self._get_grad(image)
 21         self.cost_grad_mag = 1 - self.grad_mag/np.max(self.grad_mag)
 22         # Weight for (Canny edges, gradient magnitude, gradient direction)
 23         self.weight = (0.425, 0.425, 0.15)
 24         
 25         self.n_pixs = self.x_lim * self.y_lim
 26         self.n_processed = 0
 27     
 28     @classmethod
 29     def _get_grad(cls, image):
 30         """
 31         Return the gradient magnitude of the image using Sobel operator
 32         """
 33         rgb = True if len(image.shape) > 2 else False
 34         grad_x = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)
 35         grad_y = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)
 36         if rgb:
 37             # A very rough approximation for quick verification...
 38             grad_x = np.max(grad_x, axis=2)
 39             grad_y = np.max(grad_y, axis=2)
 40             
 41         grad_mag = np.sqrt(grad_x**2+grad_y**2)
 42         grad_x /= grad_mag
 43         grad_y /= grad_mag
 44         
 45         return grad_x, grad_y, grad_mag
 46     
 47     def _get_neighbors(self, p):
 48         """
 49         Return 8 neighbors around the pixel p
 50         """
 51         x, y = p
 52         x0 = 0 if x == 0 else x - 1
 53         x1 = self.x_lim if x == self.x_lim - 1 else x + 2
 54         y0 = 0 if y == 0 else y - 1
 55         y1 = self.y_lim if y == self.y_lim - 1 else y + 2
 56         
 57         return [(x, y) for x in xrange(x0, x1) for y in xrange(y0, y1) if (x, y) != p]
 58     
 59     def _get_grad_direction_cost(self, p, q):
 60         """
 61         Calculate the gradient changes refer to the link direction
 62         """
 63         dp = (self.grad_y[p[0]][p[1]], -self.grad_x[p[0]][p[1]])
 64         dq = (self.grad_y[q[0]][q[1]], -self.grad_x[q[0]][q[1]])
 65         
 66         l = np.array([q[0]-p[0], q[1]-p[1]], np.float)
 67         if 0 not in l:
 68             l *= SQRT_0_5
 69         
 70         dp_l = np.dot(dp, l)
 71         l_dq = np.dot(l, dq)
 72         if dp_l < 0:
 73             dp_l = -dp_l
 74             l_dq = -l_dq
 75         
 76         # 2/3pi * ...
 77         return 0.212206590789 * (np.arccos(dp_l)+np.arccos(l_dq))
 78     
 79     def _local_cost(self, p, q):
 80         """
 81         1. Calculate the Canny edges & gradient magnitude cost taking into account Euclidean distance
 82         2. Combine with gradient direction
 83         Assumption: p & q are neighbors
 84         """
 85         diagnol = q[0] == p[0] or q[1] == p[1]
 86         
 87         # c0, c1 and c2 are costs from Canny operator, gradient magnitude and gradient direction respectively
 88         if diagnol:
 89             c0 = self.cost_edges[p[0]][p[1]]-SQRT_0_5*(self.cost_edges[p[0]][p[1]]-self.cost_edges[q[0]][q[1]])
 90             c1 = self.cost_grad_mag[p[0]][p[1]]-SQRT_0_5*(self.cost_grad_mag[p[0]][p[1]]-self.cost_grad_mag[q[0]][q[1]])
 91             c2 = SQRT_0_5 * self._get_grad_direction_cost(p, q)
 92         else:
 93             c0 = self.cost_edges[q[0]][q[1]]
 94             c1 = self.cost_grad_mag[q[0]][q[1]]
 95             c2 = self._get_grad_direction_cost(p, q)
 96         
 97         if np.isnan(c2):
 98             c2 = 0.0
 99         
100         w0, w1, w2 = self.weight
101         cost_pq = w0*c0 + w1*c1 + w2*c2
102         
103         return cost_pq * cost_pq
104 
105     def get_path_matrix(self, seed):
106         """
107         Get the back tracking matrix of the whole image from the cost matrix
108         """
109         neighbors = []          # 8 neighbors of the pixel being processed
110         processed = set()       # Processed point
111         cost = {seed: 0.0}      # Accumulated cost, initialized with seed to itself
112         paths = {}
113 
114         self.n_processed = 0
115         
116         while cost:
117             # Expand the minimum cost point
118             p = min(cost, key=cost.get)
119             neighbors = self._get_neighbors(p)
120             processed.add(p)
121 
122             # Record accumulated costs and back tracking point for newly expanded points
123             for q in [x for x in neighbors if x not in processed]:
124                 temp_cost = cost[p] + self._local_cost(p, q)
125                 if q in cost:
126                     if temp_cost < cost[q]:
127                         cost.pop(q)
128                 else:
129                     cost[q] = temp_cost
130                     processed.add(q)
131                     paths[q] = p
132             
133             # Pop traversed points
134             cost.pop(p)
135             
136             self.n_processed += 1
137         
138         return paths

livewire.py