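Vectorized version of array calculation: answers

【Question title】: Vectorized version of array calculation
【Posted】: 2015-10-12 04:05:12
【Question description】:

Is there a way to vectorize the following array calculation (i.e. without using a for loop)?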

for i in range(numCells):
    z[i] = ((i_mask == i)*s_image).sum()/pixel_counts[i]

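s_image is an image stored as a 2-D ndarray (I have dropped the colour dimension here for simplicity). i_mask is also a 2-D array, the same size as s_image, but it contains integers that are indexes into a list of "cells" of length numCells. The result, z, is a 1-D array of length numCells. The purpose of the calculation is to sum all the pixel values where the mask contains the same index and put the results in the z vector. (pixel_counts is also a 1-D array of length numCells.)

【Discussion】:

Perhaps I could create an array or DataFrame with one row per pixel, holding the cell index value and the colour values, and then sum all rows that share the same index (like a pivot table). Would that be faster?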
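
The "sum all rows that share the same index" idea can also be expressed without building a table. The following is a minimal sketch, not taken from any of the answers, that uses np.bincount with its weights argument. The function name cell_means_bincount and the toy inputs are hypothetical, and the sketch assumes every value in i_mask is a non-negative integer below numCells (np.bincount rejects negative values, so a background label such as -1 would need to be filtered out first).

```python
import numpy as np

def cell_means_bincount(s_image, i_mask, pixel_counts):
    """Sum the pixel values of each cell, then divide by the per-cell counts."""
    num_cells = len(pixel_counts)
    # With weights=..., bincount adds up s_image values per index
    # instead of counting occurrences.
    sums = np.bincount(i_mask.ravel(), weights=s_image.ravel(),
                       minlength=num_cells)
    return sums / pixel_counts

# Hypothetical toy inputs: a 4x6 image whose mask cycles through 5 cells.
rs = np.random.RandomState(0)
num_cells = 5
i_mask = np.arange(24).reshape(4, 6) % num_cells
s_image = rs.rand(4, 6)
pixel_counts = np.bincount(i_mask.ravel(), minlength=num_cells)

z = cell_means_bincount(s_image, i_mask, pixel_counts)
print(z)
```

Unlike a comparison against every cell index, this makes a single pass over the pixels and does not allocate an intermediate array of size numCells x (number of pixels).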

Tags: python-2.7 multidimensional-array vectorization

【Solution 1】:

As one vectorized approach, you can make use of broadcasting and matrix multiplication, like this:

# Generate a boolean array of matches for all elements in i_mask against
# an array of indices going from 0 to numCells-1
matches = i_mask.ravel() == np.arange(numCells)[:,None]

# Do elementwise multiplication against s_image and sum those up for
# each such index going from 0 to numCells-1. This is essentially doing
# matrix multiplication. Finally, elementwise divide by pixel_counts
out = matches.dot(s_image.ravel())/pixel_counts

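Alternatively, as another vectorized approach, you can use np.einsum to do the multiplication and summation, which might give a performance boost, like this: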

out = np.einsum('ij,j->i',matches,s_image.ravel())/pixel_counts
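
To make the shapes concrete, here is a small worked sketch of both variants on a 2x3 image with three cells. The input values are made up for illustration; only the two expressions for matches and out come from the answer above.

```python
import numpy as np

# Hypothetical toy inputs: 3 cells on a 2x3 image, 2 pixels per cell.
num_cells = 3
i_mask = np.array([[0, 1, 1],
                   [2, 0, 2]])
s_image = np.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])
pixel_counts = np.array([2.0, 2.0, 2.0])

# Broadcasting a (6,) row against a (3, 1) column gives a (3, 6) boolean
# array: row k is True wherever the flattened mask equals k.
matches = i_mask.ravel() == np.arange(num_cells)[:, None]

# Each row of matches selects and sums the pixels of one cell.
out_dot = matches.dot(s_image.ravel()) / pixel_counts
out_einsum = np.einsum('ij,j->i', matches, s_image.ravel()) / pixel_counts

print(matches.shape)  # (3, 6)
print(out_dot)        # cell 0: (1+5)/2, cell 1: (2+3)/2, cell 2: (4+6)/2
```

Note that matches holds numCells x (m*n) booleans, so its memory footprint grows with both the number of cells and the image size.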

Runtime tests:

Function definitions:

import numpy as np

def vectorized_app1(s_image,i_mask,pixel_counts):
    numCells = len(pixel_counts)  # one entry per cell, so no global is needed
    matches = i_mask.ravel() == np.arange(numCells)[:,None]
    return matches.dot(s_image.ravel())/pixel_counts

def vectorized_app2(s_image,i_mask,pixel_counts):
    numCells = len(pixel_counts)
    matches = i_mask.ravel() == np.arange(numCells)[:,None]
    return np.einsum('ij,j->i',matches,s_image.ravel())/pixel_counts

def org_app(s_image,i_mask,pixel_counts):
    numCells = len(pixel_counts)
    z = np.zeros(numCells)
    for i in range(numCells):
        z[i] = ((i_mask == i)*s_image).sum()/pixel_counts[i]
    return z

Timings:

In [7]: # Inputs
   ...: numCells = 100
   ...: m,n = 100,100
   ...: pixel_counts = np.random.rand(numCells)
   ...: s_image = np.random.rand(m,n)
   ...: i_mask = np.random.randint(0,numCells,(m,n))
   ...: 

In [8]: %timeit org_app(s_image,i_mask,pixel_counts)
100 loops, best of 3: 8.13 ms per loop

In [9]: %timeit vectorized_app1(s_image,i_mask,pixel_counts)
100 loops, best of 3: 7.76 ms per loop

In [10]: %timeit vectorized_app2(s_image,i_mask,pixel_counts)
100 loops, best of 3: 4.08 ms per loop

【Discussion】:

Thanks very much, Divakar! This is great. Here are the timings with the actual image and the numCells size I am using:

In [643]: %timeit org_app(s_image, i_mask, pixel_counts)
1 loops, best of 3: 363 ms per loop

In [644]: %timeit vectorized_app1(s_image, i_mask, pixel_counts)
1 loops, best of 3: 222 ms per loop

In [645]: %timeit vectorized_app2(s_image, i_mask, pixel_counts)
1 loops, best of 3: 270 ms per loop

【Solution 2】:

Here is my solution (it handles all three colours). I am not sure how efficient it is. Does anyone have a better solution?

import numpy as np
import pandas as pd

# Unravel the mask matrix into a 1-d array
i = np.ravel(i_mask)

# Unravel the image into 1-d arrays for
# each colour (RGB)
r = np.ravel(s_image[:,:,0])
g = np.ravel(s_image[:,:,1])
b = np.ravel(s_image[:,:,2])

# prepare a dictionary to create the dataframe
data = {'i' : i, 'r' : r, 'g' : g, 'b' : b}

# create a dataframe
df = pd.DataFrame(data)

# Use pandas pivot table to average the colour
# intensities for each cell index value
pixAvgs = pd.pivot_table(df, values=['r', 'g', 'b'], index='i')
pixAvgs.head()

Output:

            b           g           r
i                                    
-1  26.719482   68.041868  101.603297
 0  75.432432  170.135135  202.486486
 1  92.162162  184.189189  208.270270
 2  71.179487  171.897436  201.846154
 3  76.026316  178.078947  211.605263
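
The same per-cell averages can also be obtained with groupby, which some readers may find more direct than a pivot table. This is a minimal sketch on synthetic data: the image, the mask and the name pix_avgs are hypothetical stand-ins, not the poster's data. As in the pivot table above, any background label in the mask (such as the -1 row in the output) would simply become its own group.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real image and mask.
rs = np.random.RandomState(0)
m, n, num_cells = 8, 8, 4
s_image = rs.randint(0, 256, size=(m, n, 3))          # RGB image
i_mask = np.arange(m * n).reshape(m, n) % num_cells   # cell index per pixel

# One row per pixel: three colour columns plus the cell index.
df = pd.DataFrame(s_image.reshape(-1, 3), columns=['r', 'g', 'b'])
df['i'] = i_mask.ravel()

# Average each colour channel over all pixels that share a cell index.
pix_avgs = df.groupby('i')[['r', 'g', 'b']].mean()
print(pix_avgs)
```

Reshaping the image to (m*n, 3) avoids unravelling each colour channel separately.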

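【Discussion】:

【Solution 3】:

In the end I solved this problem a different way, and it dramatically increased the speed. Instead of using i_mask as above (a 2-D array of indexes into the 1-D array of output intensities, z), I created a different array, mask1593, of dimension (numCells x 45). Each row is a list of about 35 to 45 indexes into the flattened 256x256 pixel image (values in the range 0 to 65535).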

In [10]: mask1593[0]
Out[10]: 
array([14853, 14854, 15107, 15108, 15109, 15110, 15111, 15112, 15363,
       15364, 15365, 15366, 15367, 15368, 15619, 15620, 15621, 15622,
       15623, 15624, 15875, 15876, 15877, 15878, 15879, 15880, 16131,
       16132, 16133, 16134, 16135, 16136, 16388, 16389, 16390, 16391,
       16392,     0,     0,     0,     0,     0,     0,     0,     0], dtype=int32)

I was then able to achieve the same transformation using numpy's advanced indexing, as follows:

def convert_image(self, image_array):
    """Convert 256 x 256 RGB image array to 1593 RGB led intensities."""
    global mask1593
    shape = image_array.shape
    img_data = image_array.reshape(shape[0]*shape[1], shape[2])
    return np.mean(img_data[mask1593], axis=1)
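
The answer does not show how mask1593 was built. The sketch below is one possible way to derive such a padded index table from an i_mask-style array, under the assumption that every cell contains at least one pixel. The helper names build_index_table and convert_image_masked and the valid array are hypothetical. If the trailing zeros in a row are padding rather than real pixel indexes, a plain np.mean over the row would also average in pixel 0, so this sketch masks the padded slots out before averaging.

```python
import numpy as np

def build_index_table(i_mask, num_cells):
    """Return (table, valid): flat pixel indexes per cell, zero-padded to a
    common width, plus a boolean array marking the real (non-padding) slots."""
    flat = i_mask.ravel()
    rows = [np.flatnonzero(flat == k) for k in range(num_cells)]
    width = max(len(r) for r in rows)
    table = np.zeros((num_cells, width), dtype=np.int32)
    valid = np.zeros((num_cells, width), dtype=bool)
    for k, r in enumerate(rows):
        table[k, :len(r)] = r
        valid[k, :len(r)] = True
    return table, valid

def convert_image_masked(image_array, table, valid):
    """Average the RGB values of each cell, ignoring padded slots."""
    h, w, c = image_array.shape
    img_data = image_array.reshape(h * w, c)
    gathered = img_data[table]                 # (num_cells, width, channels)
    gathered = gathered * valid[:, :, None]    # zero out the padded slots
    return gathered.sum(axis=1) / valid.sum(axis=1)[:, None]

# Hypothetical toy inputs: a 2x3 RGB image split into 3 cells.
i_mask = np.array([[0, 0, 1],
                   [2, 1, 1]])
image = np.arange(18, dtype=float).reshape(2, 3, 3)
table, valid = build_index_table(i_mask, 3)
led = convert_image_masked(image, table, valid)
print(led)
```

The table only needs to be built once per mask. After that, each frame costs a single advanced-indexing gather and a reduction, which is consistent with the speed-up reported in this answer.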

Here is the result! A 256x256 pixel colour image converted to an array of 1593 colours for display on this irregular LED display:

【Discussion】:

In [25]: %timeit dis.convert_image(img256)
100 loops, best of 3: 2.56 ms per loop