矩阵乘法与7D permute
% Get sizes
[m1,m2,~] = size(X);
[n1,n2,N,n4,n5] = size(Y);
% Perform kron format elementwise multiplication betwen the first two dims
% of X and Y, keeping the third dim aligned and "pushing out" leftover dims
% from Y to the back
mults = bsxfun(@times,permute(X,[4,2,5,1,3]),permute(Y,[1,6,2,7,3,4,5]));
mults3D = reshape(mults,m1*n1,m2*n2,[]);
Emults3D = reshape(E*reshape(mults3D,size(mults3D,1),[]),size(mults3D));
% Trace summations by using linear indices of diagonal on 3D slices in Emults3D
MN = m1*n1;
idx = 1:MN+1:MN^2;
idx2D = bsxfun(@plus,idx(:),MN^2*(0:size(Emults3D,3)-1));
pr_sums = sum(Emults3D(idx2D),1);
% Perform "M/pr" equivalent elementwise divisions and then use
% matrix-multiplication to reduce the iterative summations
Mp = bsxfun(@rdivide,mults3D,reshape(pr_sums,1,1,[]));
out = reshape(Mp,[],size(Mp,3))*reshape(permute(H,[3,1,2]),[],1);
out = reshape(out,m1*n1,m2*n2);
基准测试
输入是这样设置的 -
% Size parameter
n = 5;
% Setup inputs
X = rand(n,n,n);
Y = rand(n,n,n,n,n);
E = rand(n*n,n*n)>0.5;
H = rand(n,n,n);
num_iter = 500; % Number of iterations to run the approaches for
运行时结果是 -
----------------------------- With Loop
Elapsed time is 8.806286 seconds.
----------------------------- With Vectorization
Elapsed time is 1.471877 seconds.
将大小参数n 设置为10,运行时为-
----------------------------- With Loop
Elapsed time is 5.068872 seconds.
----------------------------- With Vectorization
Elapsed time is 4.399783 seconds.