使用 numpy 自定义二维矩阵运算答案

【问题标题】：Custom 2D matrix operation using numpy使用 numpy 自定义二维矩阵运算
【发布时间】：2021-12-12 16:11:19
【问题描述】：

我有两个稀疏二进制矩阵 A 和 B 具有匹配的维度，例如A 的形状为 I x J，B 的形状为 J x K。我有一个自定义操作会产生一个形状为I x J x K 的矩阵C，其中每个元素(i,j,k) 是1，只有A(i,j) = 1 and B(j,k) = 1。我目前已经实现了这个操作如下：

import numpy as np

I = 2
J = 3
K = 4

A = np.random.randint(2, size=(I, J))
B = np.random.randint(2, size=(J, K))

# Custom method
C = np.zeros((I,J,K))
for i in range(I):
    for j in range(J):
        for k in range(K):
            if A[i,j] == 1 and B[j,k] == 1:
                C[i,j,k] = 1

print(C)

但是，对于大型 I,J,K，for 循环非常慢。是否可以使用 numpy 方法来实现此操作以加快速度？我看过np.multiply.outer，但到目前为止没有成功。

【问题讨论】：

你试过 numba 吗？如果矩阵是稀疏的，为什么不使用它们的稀疏表示？应该更快地只遍历包含非零元素的位置
@yannziselman 还没有，但我也在考虑。
稀疏矩阵通常用于相当大的数组。您确定要生成一个形状为 (I,J,K) 的非稀疏数组 C，而不仅仅是满足条件的索引吗？
@max9111 是的，实际上我使用大约 256 的尺寸，但还有更多的索引，所以 (I,J,K,L,M,...)。这意味着最终的数组不仅相当稀疏，而且对于 numpy 来说太大了
这看起来像是可以通过广播执行的外部产品。 scipy.sparse 仅限于 2d。

标签： python numpy matrix sparse-matrix multiplication

【解决方案1】：

给你：

C = np.einsum('ij,jk->ijk', A,B)

【讨论】：

总是np.einsum，谢谢！
还有A[:,:,None]*B

【解决方案2】：

尝试用 numba 做你已经在做的事情。这是一个使用您的代码、Sehan2 的方法和 numba 的示例：

import numpy as np
from numba import jit, prange

I = 2
J = 3
K = 4

np.random.seed(0)

A = np.random.randint(2, size=(I, J))
B = np.random.randint(2, size=(J, K))

# Custom method
def Custom_method(A, B):
    I, J = A.shape
    J, K = B.shape
    C = np.zeros((I,J,K))
    for i in range(I):
        for j in range(J):
            for k in range(K):
                if A[i,j] == 1 and B[j,k] == 1:
                    C[i,j,k] = 1
    return C

def Custom_method_ein(A, B):
    C = np.einsum('ij,jk->ijk', A,B)
    return C

@jit(nopython=True)
def Custom_method_numba(A, B):
    I, J = A.shape
    J, K = B.shape
    C = np.zeros((I,J,K))
    for i in prange(I):
        for j in prange(J):
            for k in prange(K):
                if A[i,j] == 1 and B[j,k] == 1:
                    C[i,j,k] = 1
    return C

print('original')
%timeit Custom_method(A, B)
print('einsum')
%timeit Custom_method_ein(A, B)
print('numba')
%timeit Custom_method_numba(A, B)

输出：

original
10000 loops, best of 5: 18.8 µs per loop
einsum
The slowest run took 20.51 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 3.32 µs per loop
numba
The slowest run took 15.99 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 815 ns per loop

请注意，如果您使用稀疏矩阵表示，您可以使您的代码运行得更快、更高效。这样可以避免执行不必要的操作。

【讨论】：