Count and find pattern in a 2D array in Python

【Question title】: Count and find pattern in 2D array in python
【Posted】: 2020-04-09 13:38:28
【Question description】:

I have the following data:

data = np.array([[1, 0,-1, 0, 0, 1, 0,-1, 0, 0, 1],
                 [1, 1, 0, 0,-1, 0, 1, 0, 0,-1, 0],
                 [1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0],
                 [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]])

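I want to count the 0s in each row, where each count is the length of one run of adjacent 0s. The result I want is a new array like this: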

[[1 2 1 2]
 [2 1 2 1]
 [2 2 1 2]
 [2 5 2]]

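I want to search for the 2 1 2 pattern ratio (also within each row), with some tolerance in case the numbers are slightly off, and save the coordinates of the 1 in the pattern.

So I would find 2 1 2, or 4 2 4, or 6 3 6, or 9 5 10 (within tolerance), and so on.

Expected result: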

[[0,6],[1,5],[2,7]]

Those are the positions of each 1 in the 2 1 2 patterns of the data array.

I have tried the following code:

np.unique(data, return_counts=True, axis=1)

I fiddled with this, but the result was not what I expected. This is for image processing, and the data is large.

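【Question discussion】:

If there are multiple 2 1 2 patterns in a row, how do you want to capture those indices? If there are overlapping 2 1 2 patterns in a row (for example 2 1 2 1 2 1 2 1 2), how do you want to capture those indices?
@wwii That will not happen in my data.
...see my edit - it should handle that.
Your example data has 11 columns - does the real data only have 11 columns?

Tags: python arrays numpy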

【Solution 1】:

import numpy as np

data = np.array([[1, 0,-1, 0, 0, 1, 0,-1, 0, 0, 1],
                 [1, 1, 0, 0,-1, 0, 1, 0, 0,-1, 0],
                 [1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0],
                 [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]])
a = data

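Count the consecutive zeros in each row:
NumPy and a Python loop.
Iterate over the rows; find the indices of the zeros; split those indices wherever two neighbours differ by more than one; take the length of each resulting piece.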

for row in a:
    zeros = np.where(row==0)[0]
    neighbors = (np.argwhere(np.diff(zeros)>1)+1).ravel()
    w = np.split(zeros,neighbors)
    counts = [thing.shape[0] for thing in w]
    print(counts)
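
The loop above only reports the run lengths. To get back to column positions in `data` later, it helps to keep the start index of each run as well. A minimal sketch of that, assuming the same `data` as in the question; `zero_runs` is a hypothetical helper name, not a NumPy function:

```python
import numpy as np

def zero_runs(row):
    """Return (starts, lengths) for each run of consecutive zeros in a 1-D array."""
    is_zero = np.asarray(row) == 0
    # Pad with False on both sides so runs touching an edge still get a start and an end.
    padded = np.concatenate(([False], is_zero, [False]))
    change = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(change == 1)    # first index of each run
    ends = np.flatnonzero(change == -1)     # one past the last index of each run
    return starts, ends - starts

data = np.array([[1, 0,-1, 0, 0, 1, 0,-1, 0, 0, 1],
                 [1, 1, 0, 0,-1, 0, 1, 0, 0,-1, 0],
                 [1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0],
                 [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]])

for row in data:
    starts, lengths = zero_runs(row)
    print(starts.tolist(), lengths.tolist())
```

For the first row this gives starts `[1, 3, 6, 8]` and lengths `[1, 2, 1, 2]`, so the lengths match the counts printed by the loop above.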

Pattern indices:
Use some broadcasting - operate on all rows at once while iterating over the columns.

# pattern to search for:
# notzero,zero,zero,notzero,zero,notzero,zero,zero,notzero
pattern = np.array([False,True,True,False,True,False,True,True,False])    

# find zeros in data and pad
padded = np.pad(a==0,1)
dif = padded.shape[1] - pattern.shape[0]
for i in range(dif+1):
    stop = i+pattern.shape[0]
    test = padded[:,i:stop]
    equal =  test == pattern
    equal = np.all(equal,1)
    if equal.any():
        # report every row that matches at this offset, not only the first one
        for row in np.flatnonzero(equal):
            print(f'[{row-1},{i+3}]')

This should find multiple (separate and overlapping) patterns within a row - it seems to work for:

data = np.array([[1, 0,-1, 0, 0, 1, 0,-1, 0, 0, 1, 0,-1, 0, 0, 1,-1, 0, 0, 1, 0,-1, 0, 0],
                 [1, 1, 0, 0,-1, 0, 1, 0, 0,-1, 0, 1, 0, 0,-1, 0, 0, 0,-1, 0, 1, 0, 0,-1],
                 [1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0,-1, 0, 1, 0, 0, 0, 1, 0, 0,-1, 0, 1, 0],
                 [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]])
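
The boolean pattern above only matches an exact 2 1 2. For the ratio-with-tolerance variant the question describes (4 2 4, 9 5 10, and so on), one option is to compare run lengths instead of pixels. This is only a sketch under several assumptions: `zero_runs` and `find_ratio_pattern` are hypothetical helpers, `tol` is a relative tolerance I picked, the width of the non-zero separators between runs is ignored, and the reported column is the centre of the middle run (which is the position of the 1 when that run has length 1).

```python
import numpy as np

def zero_runs(row):
    """Return (starts, lengths) for each run of consecutive zeros in a 1-D array."""
    is_zero = np.asarray(row) == 0
    padded = np.concatenate(([False], is_zero, [False]))
    change = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(change == 1)
    ends = np.flatnonzero(change == -1)
    return starts, ends - starts

def find_ratio_pattern(data, ratio=(2, 1, 2), tol=0.25):
    """Return [row, col] for every three consecutive zero runs whose lengths fit `ratio`."""
    hits = []
    for r, row in enumerate(data):
        starts, lengths = zero_runs(row)
        for k in range(len(lengths) - 2):
            left, mid, right = lengths[k:k + 3]
            unit = mid / ratio[1]  # width of one ratio unit, taken from the middle run
            ok_left = abs(left - ratio[0] * unit) <= tol * ratio[0] * unit
            ok_right = abs(right - ratio[2] * unit) <= tol * ratio[2] * unit
            if ok_left and ok_right:
                # assumption: report the centre of the middle run
                hits.append([r, int(starts[k + 1] + mid // 2)])
    return hits

data = np.array([[1, 0,-1, 0, 0, 1, 0,-1, 0, 0, 1],
                 [1, 1, 0, 0,-1, 0, 1, 0, 0,-1, 0],
                 [1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0],
                 [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]])

print(find_ratio_pattern(data))   # [[0, 6], [1, 5], [2, 7]]
```

On the 11-column example this reproduces the expected result from the question. The inner loops are plain Python, so for large images it would probably need vectorising or a different approach.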

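【Discussion】:

How do I search for the pattern I described in the question?
Thanks, but I don't think it will work for large data. How stupid of me, I forgot the ratio: I want to find the 2 1 2 pattern ratio (with some tolerance), not the exact pattern. Thank you anyway. I have just found another way to achieve the real purpose of this.
@ircham - please post that solution as an answer. Can I answer my own question?
I don't think it is a solution to this basic question. The real purpose is that I want to detect the QR code position detection patterns in an image. I posted this question because I was starting from the basics. Then I found another way that does not process the array data in the image.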

【Solution 2】:

Adapted from @jezrael's answer to "cumsum with reset", and assuming you can add a pandas dependency:

import pandas as pd
import numpy as np

data = np.array([[1, 0,-1, 0, 0, 1, 0,-1, 0, 0, 1],
                 [1, 1, 0, 0,-1, 0, 1, 0, 0,-1, 0],
                 [1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0],
                 [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]])

df = pd.DataFrame(data.T, columns=range(data.shape[0]))
a = (df == 0)
df = a.cumsum()-a.cumsum().where(~a).ffill().fillna(0).astype(int)

# Add a last row of zeros 
df.loc[len(df)] = 0

# Define custom function to apply column-wise
def find_pattern(col):
    c = col.to_numpy()
    ids = np.argwhere(c==0) - 1 
    ids = ids[ids>=0]
    return [x for x in c[ids] if x!=0]

r = df.apply(lambda col: find_pattern(col), axis=0)

r
0    [1, 2, 1, 2]
1    [2, 1, 2, 1]
2    [2, 2, 1, 2]
3       [2, 5, 2]
dtype: object

The result r is a pandas Series indexed by the row index of data, with the expected output as its values.
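
To see what the `cumsum` line does, here is the same arithmetic on a single small column. This is only an illustration; the Series `a` below is made-up input, where True marks a zero in the data.

```python
import pandas as pd

a = pd.Series([False, True, True, False, True])   # True marks a zero in the data
running = a.astype(int).cumsum()                  # 0 1 2 2 3
# Keep the running total only at non-zero cells, then carry it forward over the zeros.
baseline = running.where(~a).ffill().fillna(0).astype(int)   # 0 0 0 2 2
run_counter = running - baseline                  # restarts after every non-zero cell
print(run_counter.tolist())                       # [0, 1, 2, 0, 1]
```

Each run of zeros counts up from 1 and the counter drops back to 0 at the next non-zero cell, so the value just before each 0 in the counter is the length of a run. That is what `find_pattern` picks out, with the appended row of zeros closing a run that reaches the end of the column.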

Finally, to find the [2,1,2] pattern, you can use pandas functionality again:

r = pd.DataFrame(r, columns=['zeros'])
r['string_col'] = r['zeros'].apply(lambda row: ''.join([str(x) for x in row]))

pattern_as_string = '212'
r['pattern_index'] = r['string_col'].str.find(pattern_as_string)

         zeros  string_col  pattern_index
0  [1, 2, 1, 2]       1212              1
1  [2, 1, 2, 1]       2121              0
2  [2, 2, 1, 2]       2212              1
3     [2, 5, 2]        252             -1

pattern_index is the index at which the pattern starts, or -1 if it is not found.
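
One caveat with the string approach: once a count has two or more digits, the joined string becomes ambiguous and the index returned by `str.find` no longer lines up with the list position. A minimal sketch that compares the lists directly instead; `find_sublist` is a hypothetical helper, not a pandas function:

```python
def find_sublist(values, pattern):
    """Return the first index where `pattern` occurs in `values`, or -1 if absent."""
    n = len(pattern)
    for i in range(len(values) - n + 1):
        if list(values[i:i + n]) == list(pattern):
            return i
    return -1

print(find_sublist([1, 2, 1, 2], [2, 1, 2]))   # 1
print(find_sublist([2, 5, 2], [2, 1, 2]))      # -1
# [1, 21, 2] joins to '1212', so the string search reports a false match at 1:
print(find_sublist([1, 21, 2], [2, 1, 2]))     # -1
```

It could be applied with something like `r['zeros'].apply(lambda z: find_sublist(z, [2, 1, 2]))`. Either way, the index refers to the list of run counts, not to a column of data.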

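【Discussion】:

It can count the 0s. But I am still confused about how to find the 2 1 2 pattern and its position in the first array. I have only just discovered pandas.
@ircham I edited my answer to add the pattern search. Hope it helps!
But what I need is the pattern index in the first array (data), not in the new array @FBruzzesi
Then I am not sure what your expected output is, since there are other values in between. Please provide the result you expect.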