使用 Matlab 进行图形切割答案

【问题标题】：Graph cut using Matlab使用 Matlab 进行图形切割
【发布时间】：2014-11-16 13:13:14
【问题描述】：

我正在尝试为我的分割任务应用图形切割方法。我在以下位置找到了一些示例代码 Graph_Cut_Demo。

部分代码如下所示

img = im2double( imread([ImageDir 'cat.jpg']) ); 
[ny,nx,nc] = size(img); 
d = reshape( img, ny*nx, nc );  
k = 2; % number of clusters 
[l0 c] = kmeans( d, k ); 
l0 = reshape( l0, ny, nx ); 

% For each class, the data term Dc measures the distance of 
% each pixel value to the class prototype. For simplicity, standard 
% Euclidean distance is used. Mahalanobis distance (weighted by class 
% covariances) might improve the results in some cases.  Note that the 
% image intensity values are in the [0,1] interval, which provides 
% normalization.   

Dc = zeros( ny, nx, k ); 
for i = 1:k 
  dif = d - repmat( c(i,:), ny*nx,1 ); 
  Dc(:,:,i) = reshape( sum(dif.^2,2), ny, nx ); 
end

似乎该方法使用k-means聚类来初始化图并获得数据项Dc。但是，我不明白他们如何计算这个数据项。他们为什么使用

dif = d - repmat( c(i,:), ny*nx,1 );

在你所说的 cmets 中，数据项 Dc 测量每个像素值到类原型的距离。什么是类原型，为什么可以通过 k-means 标签确定？

在另一个实现 Graph_Cut_Demo2 中，它使用了

% calculate the data cost per cluster center
Dc = zeros([sz(1:2) k],'single');
for ci=1:k
    % use covariance matrix per cluster
    icv = inv(cov(d(l0==ci,:)));    
    dif = d- repmat(c(ci,:), [size(d,1) 1]);
    % data cost is minus log likelihood of the pixel to belong to each
    % cluster according to its RGB value
    Dc(:,:,ci) = reshape(sum((dif*icv).*dif./2,2),sz(1:2));
end

这让我很困惑。为什么他们计算协方差矩阵以及他们如何使用负对数似然形成数据项？有任何可用于这些实施的论文或描述吗？

非常感谢。

【问题讨论】：

标签： matlab graph image-segmentation

【解决方案1】：

两个图切分割示例都密切相关。 Image Processing, Analysis, and Machine Vision: A MATLAB Companion book 的作者（第一个例子）使用了Shai Bagon 的graph cut wrapper 代码（自然得到了作者的许可）——第二个例子。

那么，数据项到底是什么？
数据项表示每个像素独立如何可能属于每个标签。这就是使用对数似然项的原因。

更具体地说，在这些示例中，您尝试根据图像的颜色将图像分割成k 段。您假设图像中只有k 主色（这不是一个非常实际的假设，但足以用于教育目的）。使用k-意味着您尝试找出这两种颜色是什么。 k-means 的输出是 k RGB 空间的中心 - 即 k “代表”颜色。每个像素属于任何k 中心的可能性与像素与代表k-th 中心的距离（在颜色空间中）成反比：距离越大，像素属于的可能性越小对于k-th 中心，一个必须“支付”该像素以将该像素分配给k-th 簇的一元能量损失越高。
第二个示例将这一概念提前了一步，并假设k 簇在颜色空间中可能具有不同的密度，并使用每个簇的协方差矩阵对这种二阶行为进行建模。

在实践中，对每个段使用更复杂的颜色模型，通常是高斯混合。您可以在 GrabCut 的开创性论文中了解它（第 3 节）。

PS，
下次你可以直接给 Shai Bagon 发邮件问问。

【讨论】：