这里通常给出的答案是基于bsxfun(参见例如[1])。我提出的方法基于矩阵乘法,结果证明比我能找到的任何可比算法都要快:
helpA = zeros(numA,3*d);
helpB = zeros(numB,3*d);
for idx = 1:d
helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,idx), A(:,idx).^2 ];
helpB(:,3*idx-2:3*idx) = [B(:,idx).^2 , B(:,idx), ones(numB,1)];
end
distMat = helpA * helpB';
请注意:
对于常量d,可以通过硬编码实现替换for-loop,例如
helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,1), A(:,1).^2, ... % d == 2
ones(numA,1), -2*A(:,2), A(:,2).^2 ]; % etc.
评估:
%% create some points
d = 2; % dimension
numA = 20000;
numB = 20000;
A = rand(numA,d);
B = rand(numB,d);
%% pairwise distance matrix
% proposed method:
tic;
helpA = zeros(numA,3*d);
helpB = zeros(numB,3*d);
for idx = 1:d
helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,idx), A(:,idx).^2 ];
helpB(:,3*idx-2:3*idx) = [B(:,idx).^2 , B(:,idx), ones(numB,1)];
end
distMat = helpA * helpB';
toc;
% compare to pdist2:
tic;
pdist2(A,B).^2;
toc;
% compare to [1]:
tic;
bsxfun(@plus,dot(A,A,2),dot(B,B,2)')-2*(A*B');
toc;
% Another method: added 07/2014
% compare to ndgrid method (cf. Dan's comment)
tic;
[idxA,idxB] = ndgrid(1:numA,1:numB);
distMat = zeros(numA,numB);
distMat(:) = sum((A(idxA,:) - B(idxB,:)).^2,2);
toc;
结果:
Elapsed time is 1.796201 seconds.
Elapsed time is 5.653246 seconds.
Elapsed time is 3.551636 seconds.
Elapsed time is 22.461185 seconds.
如需更详细的评估 w.r.t.数据点的维度和数量遵循下面的讨论 (@cmets)。事实证明,在不同的设置中应该首选不同的算法。在非时间紧迫的情况下,只需使用pdist2 版本。
进一步发展:
可以考虑用基于相同原理的任何其他度量来替换平方欧几里得:
help = zeros(numA,numB,d);
for idx = 1:d
help(:,:,idx) = [ones(numA,1), A(:,idx) ] * ...
[B(:,idx)' ; -ones(1,numB)];
end
distMat = sum(ANYFUNCTION(help),3);
尽管如此,这非常耗时。将较小的d 3 维矩阵help 替换为d 2 维矩阵可能很有用。特别是对于d = 1,它提供了一种通过简单的矩阵乘法计算成对差异的方法:
pairDiffs = [ones(numA,1), A ] * [B'; -ones(1,numB)];
你还有什么想法吗?