Vectorize multiplying RGB array by color transform matrix, for image processing

【Question Title】: Vectorize multiplying RGB array by color transform matrix, for image processing
【Posted】: 2014-12-21 15:58:43
【Question Description】:

I'm teaching myself color image processing in Python 3 (with NumPy for this particular task).

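I have a 3D array image that holds the RGB values of every pixel in an image, so its shape is (height, width, 3). At each pixel I want to compute new RGB values that are a linear combination of the original RGB values at that pixel. I do this by multiplying each pixel's RGB vector by W, a 3x3 matrix of weights.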
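As a concrete illustration of the per-pixel operation described above, the sketch below applies a weight matrix to a single RGB vector. The channel-swapping W and the pixel values are made up for the example and are not taken from the question.

```python
import numpy as np

# Hypothetical weights: each row of W gives the mix of (R, G, B) for one output channel.
W = np.array([[0.0, 1.0, 0.0],   # new R = old G
              [0.0, 0.0, 1.0],   # new G = old B
              [1.0, 0.0, 0.0]])  # new B = old R

rgb = np.array([0.2, 0.5, 0.9])  # made-up pixel with values in [0, 1]

new_rgb = W @ rgb                # same as np.dot(W, rgb) for a 1-D vector
print(new_rgb)                   # [0.5 0.9 0.2]
```

Each output channel is the dot product of one row of W with the input pixel, which is why the whole image can be handled as one contraction over the last axis.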

I can do this with nested for loops, but it is slow:

import numpy as np

height, width = image.shape[:2]          # image has shape (height, width, 3)
newRGB = np.zeros((height, width, 3))    # zero-filled array to hold the new RGB values
for i in range(height):
    for j in range(width):
        RGB = image[i, j, :]             # RGB vector at this pixel, length 3: [R, G, B]
        new = np.dot(W, RGB)             # W is the 3x3 matrix of weights
        newRGB[i, j, :] = new            # store the new RGB values for this pixel

Alternatively, a faster, vectorized way is:

import numpy as np
import matplotlib.image as mpimg

image = mpimg.imread('test.png')   # read the image file with matplotlib.image
print(image.shape)                 # image has shape (height,width,3)
W = np.array([...])                # arbitrary 3x3 matrix of weights
x = np.rollaxis(image,2,1)         # move the RGB axis to the 2nd position
print(x.shape)                     # x has shape (height,3,width)
Wx = np.dot(W,x)                   # matrix multiplication
print(Wx.shape)                    # Wx has shape (3,height,width)
y = np.rollaxis(Wx,0,3)            # move the RGB axis back to the 3rd position
print(y.shape)                     # y has shape (height,width,3) like the original image

Is there a less cumbersome way to do this, for example with numpy.tensordot()?

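Also, since I am taking linear combinations of the RGB values, could I build some kind of 3D linear filter and convolve it with my image by doing a simple element-wise multiplication in FFT space?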

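Right now my images are about 1000x1000 pixels, so the RGB array has a shape of roughly (1000,1000,3). But I am also interested in vectorization for other applications that may involve larger arrays (or more dimensions), so answers that address larger array sizes and dimensions are welcome too.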
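For the higher-dimensional case, one possible approach is an einsum subscript with an ellipsis, which contracts only the last axis and leaves any number of leading axes untouched. This is a sketch under assumptions: the helper name apply_channel_matrix and the (frames, height, width, 3) video shape are hypothetical and chosen only for illustration.

```python
import numpy as np

def apply_channel_matrix(arr, W):
    """Apply W to the last axis of arr, whatever the number of leading axes."""
    # '...' stands for all leading axes; k is the input channel, l the output channel.
    return np.einsum('...k,lk->...l', arr, W)

rng = np.random.default_rng(0)
W = rng.random((3, 3))
video = rng.random((2, 4, 5, 3))   # hypothetical (frames, height, width, 3) array

out = apply_channel_matrix(video, W)
print(out.shape)                                       # (2, 4, 5, 3)
print(np.allclose(out[1, 2, 3], W @ video[1, 2, 3]))   # True
```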

【Question Discussion】:

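I wouldn't call it "a 3D tensor representing RGB values"; I'd just say RGB array (/RGBa array). It is essentially a set of 3 (/4) 2D arrays; the third dimension is always understood to be RGB/RGBa.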

Tags: python image numpy vectorization rgb

【Solution 1】:

Yes, you can use np.tensordot or np.einsum:

In [9]: np.tensordot(image, W, ([2], [1])).shape
Out[9]: (1000, 1000, 3)

In [13]: np.einsum('ijk,lk->ijl', image, W).shape
Out[13]: (1000, 1000, 3)


In [19]: x = np.rollaxis(image,2,1)

In [20]: Wx = np.dot(W,x)

In [21]: y = np.rollaxis(Wx,0,3)

In [22]: np.allclose(np.tensordot(image, W, ([2], [1])), y)
Out[22]: True

In [14]: np.allclose(np.tensordot(image, W, ([2], [1])), np.einsum('ijk,lk->ijl', image, W))
Out[14]: True
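A third spelling, not part of the original answer, is the matrix-multiplication operator: because @ broadcasts over leading axes, image @ W.T contracts the last axis of the image with the columns of W.T (that is, the rows of W). The sketch below uses small random arrays as assumed stand-ins and checks that all three forms agree.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((6, 7, 3))      # stand-in for a (height, width, 3) image
W = rng.random((3, 3))             # arbitrary 3x3 weights

via_tensordot = np.tensordot(image, W, ([2], [1]))
via_einsum = np.einsum('ijk,lk->ijl', image, W)

# matmul treats image as a stack of (width, 3) matrices and broadcasts W.T,
# so via_matmul[i, j, l] = sum_k image[i, j, k] * W[l, k]
via_matmul = image @ W.T

print(np.allclose(via_matmul, via_tensordot))  # True
print(np.allclose(via_matmul, via_einsum))     # True
```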

Of the two, np.tensordot appears to be the faster method in this case.

In [15]: %timeit np.einsum('ijk,lk->ijl', image, W)
10 loops, best of 3: 31.1 ms per loop

In [16]: %timeit np.tensordot(image, W, ([2], [1]))
100 loops, best of 3: 18.9 ms per loop
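The timings above come from one machine and one NumPy version, so the ranking may differ elsewhere. Outside IPython, a comparable measurement could be sketched with the standard timeit module; the random 1000x1000 image, the random W, and the repeat counts are assumptions made for the example.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((1000, 1000, 3))   # random stand-in for a 1000x1000 RGB image
W = rng.random((3, 3))                # arbitrary 3x3 weights

candidates = {
    "einsum": lambda: np.einsum('ijk,lk->ijl', image, W),
    "tensordot": lambda: np.tensordot(image, W, ([2], [1])),
}

number = 5                            # calls per timing run
results = {}
for name, fn in candidates.items():
    # Best of 3 runs, converted to seconds per call.
    results[name] = min(timeit.repeat(fn, number=number, repeat=3)) / number
    print(f"{name}: {results[name] * 1e3:.1f} ms per call")
```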

【Discussion】: