Is there a fast way to compare every two rows in a 2-dimensional array? (Answers)

【Question title】: Is there a fast way to compare every two rows in a 2-dimensional array?
【Posted】: 2021-04-17 08:55:33
【Question description】:

So I have a 2-dimensional array, say list:

list = [[x11, x12, x13, x14],
        [x21, x22, x23, x24],
       ...]

Some examples of list are:

# numbers in list are all integers
list = [[0, 17, 6, 10],
        [0, 7, 6, 10],
        ]
list = [[6, 50, 6, 10],
        [0, 50, 6, 10],
        ]
list = [[6, 16, 6, 10],
        [6, 6, 6, 10],
        ]
list = [[0, 50, 6, 10],
        [6, 50, 6, 10],
        [6, 40, 6, 10]
        ]
list = [[0, 27, 6, 10],
        [0, 37, 6, 10],
        ]

I need to iterate over every pair of rows, e.g. [x11, x12, x13, x14] and [x21, x22, x23, x24], and do some complex comparisons:

length = len(list)  # number of rows
cnt1 = cnt2 = cnt3 = cnt4 = cnt5 = 0
for i in range(0, length):
    for j in range(i + 1, length):
        if (list[i][0] + list[i][2] == list[j][0] or list[j][0] + list[j][2] == list[i][0]) and \
                list[i][1] == list[j][1]:
            cnt1 += 1
            if list[i][3] == list[j][3]:
                cnt2 += 1
            else:
                cnt3 += 1
        elif (list[i][1] + list[i][3] == list[j][1] or list[j][1] + list[j][3] == list[i][1]) and \
                list[i][0] == list[j][0]:
            cnt4 += 1
            if list[i][2] == list[j][2]:
                cnt2 += 1
            else:
                cnt3 += 1
        else:
            cnt5 += 1
# do something with the counts

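length is usually small here, but this nested loop runs thousands of times, so the program takes a long time to finish. I have read some tutorials on vectorizing in NumPy, but because the logic is a bit complex, I cannot figure out how to rewrite the code. Is there any way to optimize my code, even just a little? Any help would be appreciated. Thanks in advance!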

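【Question comments】:

It looks like map() could serve you well here.
Does this answer your question? how to compare 2 columns of a 2d array at a time with columns of another array in python
@UmairMubeen Thanks for your reply. But mine is more of an efficiency question than that one.
If you are after vectorization, you would need to fill in all the "# do something" parts to get a solution. If these are the real conditions, then depending on what they are and what you do with the results, native NumPy vectorization may or may not be possible. If you just want to speed up your code, have you considered using Numba?
@VortexYT Could you be more specific? Thanks!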

Tags: python arrays numpy optimization vectorization

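【Solution 1】:

I am posting a solution that shows how to do this for the first if condition and the if and else nested inside it.

You can follow similar logic to do the same for the remaining branches.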

import numpy as np

arr = np.array([[0, 17, 6, 10],
       [0, 7, 6, 10],
       [6, 50, 6, 10],
       [0, 50, 6, 10],
       [6, 16, 6, 10],
       [6, 6, 6, 10],
       [0, 50, 6, 10],
       [6, 50, 6, 10],
       [6, 40, 6, 10],
       [0, 27, 6, 10],
       [0, 37, 6, 10]])

N = len(arr)

cnt1 = cnt2 = cnt3 = cnt4 = cnt5 = 0
for i in range(0, N):
    for j in range(i + 1, N):
        if (arr[i][0] + arr[i][2] == arr[j][0] or arr[j][0] + arr[j][2] == arr[i][0]) and \
                arr[i][1] == arr[j][1]:
            cnt1 += 1
            if arr[i][3] == arr[j][3]:
                cnt2 += 1
            else:
                cnt3 += 1
        elif (arr[i][1] + arr[i][3] == arr[j][1] or arr[j][1] + arr[j][3] == arr[i][1]) and \
                    arr[i][0] == arr[j][0]:
            cnt4 += 1
            if arr[i][2] == arr[j][2]:
                cnt2 += 1
            else:
                cnt3 += 1
        else:
            cnt5 += 1

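# One direction of (arr[i][0] + arr[i][2] == arr[j][0] or arr[j][0] + arr[j][2] == arr[i][0]).
# The full N x N matrix covers both (i, j) and (j, i), so summing it matches the i < j loop
# as long as both directions cannot hold for the same pair (true here, since arr[:, 2] > 0).
cnt1_bool_c1 = ((arr[:, 0] + arr[:, 2])[:, None] == arr[:, 0][None, :])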

# arr[i][1] == arr[j][1]:
cnt1_bool_c2 = arr[:, 1][:, None] == arr[:, 1][None, :]

# So that i and j are compared only if i != j
cnt1_bool_c2[np.arange(N), np.arange(N)] = False

# AND the two previous conditions together; this completes the very first if condition
cnt1_bool = np.bitwise_and(cnt1_bool_c1, cnt1_bool_c2)

# corresponds to cnt1
cnt1_n = cnt1_bool.sum()

# verified
print(cnt1 == cnt1_n)

# corresponds to arr[i][3] == arr[j][3]
cnt2_bool_c = arr[:, 3][:, None] == arr[:, 3][None, :]

# So that i and j are compared only if i != j
cnt2_bool_c[np.arange(N), np.arange(N)] = False

# corresponds to the inner if: count only the pairs that also satisfy the outer condition
cnt2_n1 = np.bitwise_and(cnt1_bool, cnt2_bool_c).sum()  # the cnt2 += 1 in the first inner if

# corresponds to the inner else: pairs that satisfy the outer condition but not the inner one
cnt3_n1 = np.bitwise_and(cnt1_bool, ~cnt2_bool_c).sum()  # the cnt3 += 1 in the first inner else
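
For anyone who wants all five counters at once, here is a minimal sketch of one way to extend the idea above to the elif and else branches. It is not the answerer's code: it swaps the N x N boolean matrices for np.triu_indices, which enumerates each pair with i < j exactly once, so the "or" in both directions can be written directly. The function names count_pairs_loop and count_pairs_vectorized are made up for this illustration, and rows are assumed to have exactly four integer columns.

```python
import numpy as np


def count_pairs_loop(arr):
    """Reference implementation: the nested loop from the question."""
    cnt1 = cnt2 = cnt3 = cnt4 = cnt5 = 0
    n = len(arr)
    for i in range(n):
        for j in range(i + 1, n):
            p, q = arr[i], arr[j]
            if (p[0] + p[2] == q[0] or q[0] + q[2] == p[0]) and p[1] == q[1]:
                cnt1 += 1
                if p[3] == q[3]:
                    cnt2 += 1
                else:
                    cnt3 += 1
            elif (p[1] + p[3] == q[1] or q[1] + q[3] == p[1]) and p[0] == q[0]:
                cnt4 += 1
                if p[2] == q[2]:
                    cnt2 += 1
                else:
                    cnt3 += 1
            else:
                cnt5 += 1
    return cnt1, cnt2, cnt3, cnt4, cnt5


def count_pairs_vectorized(rows):
    """Same counts without a Python-level loop over pairs (hypothetical helper)."""
    arr = np.asarray(rows).reshape(-1, 4)      # assumption: four columns per row
    i, j = np.triu_indices(len(arr), k=1)      # every pair with i < j, once
    a, b = arr[i], arr[j]

    # Outer if: columns 0/2 line up in either direction and column 1 matches.
    first = ((a[:, 0] + a[:, 2] == b[:, 0]) | (b[:, 0] + b[:, 2] == a[:, 0])) \
        & (a[:, 1] == b[:, 1])
    # elif: only reached when the first condition failed.
    second = ~first \
        & ((a[:, 1] + a[:, 3] == b[:, 1]) | (b[:, 1] + b[:, 3] == a[:, 1])) \
        & (a[:, 0] == b[:, 0])

    # Inner test: column 3 in the first branch, column 2 in the second.
    same = np.where(first, a[:, 3] == b[:, 3], a[:, 2] == b[:, 2])
    matched = first | second

    return (int(first.sum()),
            int((matched & same).sum()),
            int((matched & ~same).sum()),
            int(second.sum()),
            int((~matched).sum()))


example = [[0, 50, 6, 10],
           [6, 50, 6, 10],
           [6, 40, 6, 10]]
print(count_pairs_vectorized(example))  # (1, 2, 0, 1, 1)
```

Since length is small and the loop is called thousands of times, the fixed overhead of each NumPy call may outweigh the gain for very short lists, so it is worth timing both versions on real data before switching.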

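【Comments】:

Thank you very much! I will take a look right now!
Hi! Following your logic, I also vectorized another loop. Unfortunately it does not work, and I do not know why. The loop condition is: !(arr[i][0] + arr[i][2] <= arr[j][0] or arr[j][0] + arr[j][2] <= arr[i][0] or arr[i][1] + arr[i][3] <= arr[j][1] or arr[j][1] + arr[j][3] <= arr[i][1]) && i != j. My vectorized form is: bool_c1 = (arr[:, 0] + arr[:, 2])[:, None] <= arr[:, 0][None, :]; bool_c2 = (arr[:, 1] + arr[:, 3])[:, None] <= arr[:, 1][None, :]; bool_c12 = ~np.bitwise_or(bool_c1, bool_c2); bool_c12[np.arange(N), np.arange(N)] = False; cnt = bool_c12.sum(). Could you take a look at this? Thanks!
What exactly is the problem you are running into? Working through the whole thing would be very time-consuming for me, and that is not really what SO is for. However, if you really cannot solve it and have a specific question about the implementation, perhaps you could open another question for that specific issue.
Oh, sorry, I just thought this one was very similar to the previous one and that maybe you could quickly spot where I went wrong. But I have since found another way to solve it. Thanks again for your time and patience, and have a nice day!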
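
On the follow-up in the comments above: the commenter's data and loop are not shown, so this is only a guess, but one likely cause of the mismatch is that the loop condition tests both orderings (row i against row j and row j against row i), while the vectorized form only builds the i-to-j direction. With "<=" the direction matters, so each boolean matrix probably needs to be OR-ed with its own transpose. A minimal sketch, with count_overlaps as a made-up name:

```python
import numpy as np


def count_overlaps(rows):
    """Count ordered pairs (i, j), i != j, for which none of the four
    "<=" separation tests from the comment holds (hypothetical helper)."""
    arr = np.asarray(rows).reshape(-1, 4)   # assumption: four columns per row
    n = len(arr)

    # x_sep[i, j] is True when arr[i][0] + arr[i][2] <= arr[j][0]
    x_sep = (arr[:, 0] + arr[:, 2])[:, None] <= arr[:, 0][None, :]
    # y_sep[i, j] is True when arr[i][1] + arr[i][3] <= arr[j][1]
    y_sep = (arr[:, 1] + arr[:, 3])[:, None] <= arr[:, 1][None, :]

    # The transposes supply the two tests with i and j swapped.
    separated = x_sep | x_sep.T | y_sep | y_sep.T

    overlap = ~separated
    overlap[np.arange(n), np.arange(n)] = False   # exclude i == j
    return int(overlap.sum())


rows = [[0, 0, 10, 10],
        [5, 5, 10, 10],
        [20, 20, 5, 5]]
print(count_overlaps(rows))  # 2: rows 0 and 1, counted as (0, 1) and (1, 0)
```

The result counts every pair twice, once per ordering. If the original loop only visits j > i, halve the total.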

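【Solution 2】:

In your for loop, you compare the array [x11, x12, x13, x14] with all of the following elements ([x21, x22, x23, x24], [x31, x32, x33, x34], [x41, x42, x43, x44], and so on),

and then you go on to compare [x21, x22, x23, x24] with all of the following elements ([x31, x32, x33, x34], [x41, x42, x43, x44], and so on).

To iterate over every 2 rows and compare them two by two (that is, x1 with x2, then x3 with x4), you need something like this: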

length = len(list)  # number of rows
for i in range(0, length - 1, 2):
    j = i + 1  # the row paired with row i
    if (list[i][0] + list[i][2] == list[j][0] or list[j][0] + list[j][2] == list[i][0]) and list[i][1] == list[j][1]:
        pass  # do something
        if list[i][3] == list[j][3]:
            pass  # do something
        else:
            pass  # do something

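Note that you also have to decide what to do when the list has an odd number of rows (with the range above, a trailing unpaired row is simply skipped).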
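
If non-overlapping pairs (row 0 with row 1, row 2 with row 3, and so on) really are what is wanted, the same pairing can be done with NumPy slicing. This is only a sketch of that reading: count_adjacent_pairs is a made-up name, rows are assumed to have four integer columns, an unpaired last row is ignored, and only the first if branch from the question is covered.

```python
import numpy as np


def count_adjacent_pairs(rows):
    """Compare rows (0, 1), (2, 3), ... using only the first condition from
    the question. Returns (same_last, different_last) (hypothetical helper)."""
    arr = np.asarray(rows).reshape(-1, 4)   # assumption: four columns per row
    usable = len(arr) - len(arr) % 2        # drop a trailing unpaired row
    a = arr[0:usable:2]                     # rows 0, 2, 4, ...
    b = arr[1:usable:2]                     # rows 1, 3, 5, ...

    touching = ((a[:, 0] + a[:, 2] == b[:, 0]) | (b[:, 0] + b[:, 2] == a[:, 0])) \
        & (a[:, 1] == b[:, 1])
    same_last = a[:, 3] == b[:, 3]

    return int((touching & same_last).sum()), int((touching & ~same_last).sum())


rows = [[6, 50, 6, 10],
        [0, 50, 6, 10],
        [0, 50, 6, 10],
        [6, 50, 6, 12],
        [1, 1, 1, 1]]   # fifth row has no partner and is ignored
print(count_adjacent_pairs(rows))  # (1, 1)
```

Note that this counts something different from the nested loop in the question, which compares every row with every later row.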

【Comments】:

Thanks. I will give it a try.