Filter array, store adjacency information (answers)

【Question title】: Filter array, store adjacency information
【Posted】: 2017-04-07 13:16:35
【Question description】:

Suppose I have a 2D array of shape (N, N):

import numpy as np
N = 100  # example size (assumed); must be at least 61 for the slice used below
my_array = np.random.random((N, N))

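Now I want to run some computations on only some of the "cells" of this array, for example those in its central part. To avoid computing on cells I am not interested in, what I usually do is create a Boolean mask, along these lines: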

my_mask = np.zeros_like(my_array, bool)
my_mask[40:61,40:61] = True
my_array[my_mask] = some_twisted_computations(my_array[my_mask])

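But what if some_twisted_computations() involves the values of neighboring cells, provided they lie inside the mask? Performance-wise, would it be a good idea to build an "adjacency array" of shape (len(my_mask), 4), that is, one row per masked cell, storing the indices of the 4-connected neighbor cells in the flat my_array[mask] array that I will use in some_twisted_computations()? If so, what are efficient options for computing such an adjacency array? Should I switch to a lower-level language or to another data structure?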

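My real-world array has a shape of about (1000, 1000, 1000), and the mask covers only a small fraction of these values (~100000) with a fairly complex geometry. I hope my question makes sense...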

Edit: the very dirty and slow solution I came up with:

wall = my_mask  # the Boolean mask defined above

i = 0

top_neighbors = []
down_neighbors = []
left_neighbors = []
right_neighbors = []
indices = []

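# For each masked cell, record each neighbor's (row, col) if that neighbor is
# in the mask, otherwise the cell's own flat index i.
# Note: this assumes no masked cell lies on the border of the array.
for index, val in np.ndenumerate(wall):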
    if not val:
        continue
    indices += [index]
    if wall[index[0] + 1, index[1]]:
        down_neighbors += [(index[0] + 1, index[1])]
    else:
        down_neighbors += [i]
    if wall[index[0] - 1, index[1]]:
        top_neighbors += [(index[0] - 1, index[1])]
    else:
        top_neighbors += [i]
    if wall[index[0], index[1] - 1]:
        left_neighbors += [(index[0], index[1] - 1)]
    else:
        left_neighbors += [i]
    if wall[index[0], index[1] + 1]:
        right_neighbors += [(index[0], index[1] + 1)]
    else:
        right_neighbors += [i]
    i += 1


# Convert the stored (row, col) tuples into flat indices into the masked array.
top_neighbors = [i if type(i) is int else indices.index(i) for i in top_neighbors]
down_neighbors = [i if type(i) is int else indices.index(i) for i in down_neighbors]
left_neighbors = [i if type(i) is int else indices.index(i) for i in left_neighbors]
right_neighbors = [i if type(i) is int else indices.index(i) for i in right_neighbors]

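【Question discussion】:

The best answer will probably depend on the nature of the computations you want to do. For example, if they can be expressed as a summation over neighboring pixels, then something like np.convolve or scipy.signal.fftconvolve could be a really nice solution.
Unfortunately I don't think it can be expressed as a convolution. In practice I will run many iterations, and each step needs the updated values.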

Tags: python arrays algorithm numpy

【Solution 1】:

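The best answer will probably depend on the nature of the computations you want to do. For example, if they can be expressed as a summation over neighboring pixels, then something like np.convolve or scipy.signal.fftconvolve could be a really nice solution.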
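As a minimal sketch of the neighbor-summation case: summing the 4-connected neighbors of every cell is equivalent to convolving with a cross-shaped kernel. Note that np.convolve only handles 1-D input, so for a 2-D array one would typically reach for scipy.signal.convolve2d or scipy.signal.fftconvolve. The version below sticks to plain NumPy (zero padding plus shifted slices) so that it runs without SciPy. The function name neighbor_sum and the zero-padded boundary are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def neighbor_sum(a):
    """Sum of the 4-connected neighbors of each cell (zeros outside the array)."""
    p = np.pad(a, 1, mode="constant", constant_values=0)
    # Shifted views of the padded array: up, down, left, right neighbors.
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]

rng = np.random.default_rng(0)
x = rng.random((100, 100))

mask = np.zeros_like(x, dtype=bool)
mask[40:61, 40:61] = True

# Compute on the whole array, then keep only the cells of interest.
sums_in_mask = neighbor_sum(x)[mask]
```

This computes the sum for every cell and discards most of them, so it only pays off when the mask covers a sizeable part of the array or the computation is cheap.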

For the specific problem of efficiently generating an array of neighbor indices, you could try something like this:

import numpy as np

x = np.random.rand(100, 100)
mask = x > 0.9

i, j = np.where(mask)

i_neighbors = i[:, np.newaxis] + [0, 0, -1, 1]
j_neighbors = j[:, np.newaxis] + [-1, 1, 0, 0]

# need to do something with the edge cases
# the best choice will depend on your application
# here we'll change out-of-bounds neighbors to the
# central point itself.
i_neighbors = np.clip(i_neighbors, 0, x.shape[0] - 1)
j_neighbors = np.clip(j_neighbors, 0, x.shape[1] - 1)

# compute some vectorized result over the neighbors
# as a concrete example, here we'll do a standard deviation
result = x[i_neighbors, j_neighbors].std(axis=1)

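The result is an array with one value per masked cell, containing the standard deviation of that cell's neighboring values. Hopefully this approach carries over to whatever specific problem you have in mind!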

Edit: given the edited question above, here is how my answer can be adapted to generate the index arrays in a vectorized way:

import numpy as np

x = np.random.rand(100, 100)
mask = x > 0.9  # same example mask as in the first snippet

i, j = np.where(mask)

i_neighbors = i[:, np.newaxis] + [0, 0, -1, 1]
j_neighbors = j[:, np.newaxis] + [-1, 1, 0, 0]
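# clipping maps an out-of-bounds neighbor back onto the central cell itself
i_neighbors = np.clip(i_neighbors, 0, x.shape[0] - 1)
j_neighbors = np.clip(j_neighbors, 0, x.shape[1] - 1)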

indices = np.zeros(x.shape, dtype=int)
indices[mask] = np.arange(len(i))

neighbor_in_mask = mask[i_neighbors, j_neighbors]

neighbors = np.where(neighbor_in_mask,
                     indices[i_neighbors, j_neighbors],
                     np.arange(len(i))[:, None])

left_indices, right_indices, top_indices, bottom_indices = neighbors.T
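To show how such a neighbors array might be used in an iterative scheme that works only on the flat array of masked values, here is a minimal self-contained sketch. It repeats the construction above and then applies a hypothetical update rule (the mean over the four neighbor slots) for a few iterations. The update rule, the iteration count, and the short names i_nb and j_nb are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((100, 100))
mask = x > 0.9

i, j = np.where(mask)
n = len(i)

# Same construction as above; columns are left, right, top, bottom.
i_nb = np.clip(i[:, np.newaxis] + [0, 0, -1, 1], 0, x.shape[0] - 1)
j_nb = np.clip(j[:, np.newaxis] + [-1, 1, 0, 0], 0, x.shape[1] - 1)

# Map each masked (row, col) to its position in the flat x[mask] array.
indices = np.zeros(x.shape, dtype=int)
indices[mask] = np.arange(n)

# Neighbors outside the mask fall back to the cell's own flat index.
neighbors = np.where(mask[i_nb, j_nb],
                     indices[i_nb, j_nb],
                     np.arange(n)[:, np.newaxis])

# Iterate on the flat array of masked values only.
values = x[mask]
for _ in range(10):
    # Hypothetical update: mean over the four neighbor slots of each cell.
    values = values[neighbors].mean(axis=1)

# Write the result back into the full array.
x[mask] = values
```

Each iteration touches only the masked values, so its cost scales with the number of masked cells rather than with the size of the full array. Note that the indices helper array still has the full shape of x, which may matter for very large arrays.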

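【Discussion】:

Thanks for your help. I don't think this is quite what I was expecting here. I need an explicit list of neighbor indices into the flat array. I've added the dirty (hopefully working) solution I wrote, so that the output I expect is clearer (?). It's possible that I just didn't understand your answer. :)
OK. I've edited my answer to address the clarified question.
... Your answer is Pythonic and efficient. Thanks!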