【Question title】:Vectorizing an operation between all pairs of elements in two numpy arrays
【Posted】:2017-02-02 11:03:01
【Question】:

Given two arrays in which each row represents a circle (x, y, r):

data = {}
data[1] = np.array([[455.108, 97.0478, 0.0122453333],
                    [403.775, 170.558, 0.0138770952],
                    [255.383, 363.815, 0.0179857619]])
data[2] = np.array([[455.103, 97.0473, 0.012041],
                    [210.19, 326.958, 0.0156912857],
                    [455.106, 97.049, 0.0150472381]])

I want to pull out all pairs of circles that are not disjoint. This can be done as follows:

close_data = {}
for row1 in data[1]: #loop over first array
    for row2 in data[2]: #loop over second array
        condition = ((abs(row1[0]-row2[0]) + abs(row1[1]-row2[1])) < (row1[2]+row2[2])) 
        if condition: #circles overlap if true
            if tuple(row1) not in close_data.keys():                           
                close_data[tuple(row1)] = [row1, row2] #pull out close data points
            else:
                close_data[tuple(row1)].append(row2)

for k, v in close_data.items():
    print(k, v)
#desired outcome   
#(455.108, 97.047799999999995, 0.012245333299999999)
#[array([  4.55108000e+02,   9.70478000e+01,   1.22453333e-02]), 
# array([  4.55103000e+02,   9.70473000e+01,   1.20410000e-02]), 
# array([  4.55106000e+02,   9.70490000e+01,   1.50472381e-02])]

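However, nested loops over the arrays are very inefficient for large datasets. Is it possible to vectorize the calculation so that I get the benefit of using numpy?

【Comments】:

    Tags: python arrays numpy geometry combinations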


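    【Solution 1】:

    The hardest part is actually getting the result into your representation. Note that I inserted a couple of squares, so this tests the Euclidean distance between centres. If you really don't want Euclidean distance, you have to change that back.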

    import numpy as np
    
    data = {}
    data[1] = np.array([[455.108, 97.0478, 0.0122453333],
                        [403.775, 170.558, 0.0138770952],
                        [255.383, 363.815, 0.0179857619]])
    data[2] = np.array([[455.103, 97.0473, 0.012041],
                        [210.19, 326.958, 0.0156912857],
                        [455.106, 97.049, 0.0150472381]])
    
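    # Broadcast so that every circle in data[1] meets every circle in data[2].
    d1 = data[1][:, None, :]   # shape (n1, 1, 3)
    d2 = data[2][None, :, :]   # shape (1, n2, 3)
    dists2 = ((d1[..., :2] - d2[..., :2])**2).sum(axis = -1)  # squared centre distances
    radss2 = (d1[..., 2] + d2[..., 2])**2                     # squared sums of radii
    
    # inds1[k], inds2[k] index the k-th overlapping pair; inds1 comes out sorted.
    inds1, inds2 = np.where(dists2 <= radss2)
    
    # translate to your representation:
    
    n1 = len(data[1])
    bnds = np.r_[np.searchsorted(inds1, np.arange(n1)), len(inds1)]
    rows = [data[2][inds2[bnds[i]:bnds[i+1]]] for i in range(n1)]
    out = dict([(tuple(data[1][i]), rows[i]) for i in range(n1) if rows[i].size > 0])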
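
The answer notes that it switched to squared (Euclidean) distances and that this has to be changed back to recover the question's criterion. A minimal sketch of both tests side by side, using the same broadcasting idea; the function name `overlap_mask` and its `metric` parameter are illustrative assumptions, not part of the original answer:

```python
import numpy as np

def overlap_mask(c1, c2, metric="euclidean"):
    """Boolean (len(c1), len(c2)) mask; rows of c1 and c2 are circles (x, y, r)."""
    delta = c1[:, None, :2] - c2[None, :, :2]   # (n1, n2, 2) centre offsets
    rsum = c1[:, None, 2] + c2[None, :, 2]      # (n1, n2) sums of radii
    if metric == "euclidean":
        # Compare squared quantities so no square root is needed.
        return (delta ** 2).sum(axis=-1) <= rsum ** 2
    if metric == "manhattan":
        # The question's original criterion: |dx| + |dy| < r1 + r2.
        return np.abs(delta).sum(axis=-1) < rsum
    raise ValueError("unknown metric: %r" % (metric,))

c1 = np.array([[455.108, 97.0478, 0.0122453333],
               [403.775, 170.558, 0.0138770952],
               [255.383, 363.815, 0.0179857619]])
c2 = np.array([[455.103, 97.0473, 0.012041],
               [210.19, 326.958, 0.0156912857],
               [455.106, 97.049, 0.0150472381]])

euclid = overlap_mask(c1, c2)
manhattan = overlap_mask(c1, c2, metric="manhattan")
print(np.argwhere(euclid))     # index pairs (row in c1, row in c2)
print(np.argwhere(manhattan))
```

On the sample data both masks select the same two pairs, (0, 0) and (0, 2). In general the Manhattan distance is never smaller than the Euclidean one, so the Manhattan test can only be the stricter of the two.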
    

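    【Comments】:

    • Will check it out shortly, thanks! I'm by no means married to the final representation. I really just need a way to collate all the associated circles, and the way I had it seemed a natural way to pull them out of the loop.
    【Solution 2】:

    Here is a pure numpythonic way (a is data[1] and b is data[2]):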

    In [80]: p = np.arange(3) # for creating the indices of combinations using np.tile and np.repeat
    
    In [81]: aa = a[np.repeat(p, 3)] # first column of the combination array; a itself is left untouched
    
    In [82]: bb = b[np.tile(p, 3)] # second column of the combination array
    
    In [83]: abs(aa[:, :2] - bb[:, :2]).sum(1) < aa[:, 2] + bb[:, 2]
    Out[83]: array([ True, False,  True, False, False, False, False, False, False], dtype=bool)
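
The session above hard-codes a size of 3 and returns a flat mask. As a rough sketch of how the same repeat/tile idea might be generalized to arrays of different lengths and mapped back to the per-circle grouping the question asks for (the helper name `pairwise_overlaps` is an assumption made for illustration):

```python
import numpy as np

def pairwise_overlaps(a, b):
    """Return an (len(a), len(b)) boolean mask using the question's criterion."""
    n, m = len(a), len(b)
    ia = np.repeat(np.arange(n), m)   # 0,0,...,1,1,...  row index into a
    ib = np.tile(np.arange(m), n)     # 0,1,...,0,1,...  row index into b
    aa, bb = a[ia], b[ib]             # all n*m combinations, row by row
    flat = np.abs(aa[:, :2] - bb[:, :2]).sum(axis=1) < aa[:, 2] + bb[:, 2]
    return flat.reshape(n, m)         # mask[i, j] refers to (a[i], b[j])

# First two circles of data[1] against all of data[2], to show unequal lengths.
a = np.array([[455.108, 97.0478, 0.0122453333],
              [403.775, 170.558, 0.0138770952]])
b = np.array([[455.103, 97.0473, 0.012041],
              [210.19, 326.958, 0.0156912857],
              [455.106, 97.049, 0.0150472381]])

mask = pairwise_overlaps(a, b)

# Group the overlapping circles of b under each circle of a that has any.
close = {tuple(a[i]): b[mask[i]] for i in np.flatnonzero(mask.any(axis=1))}
for key, rows in close.items():
    print(key, rows)
```

Materializing every combination costs memory proportional to len(a) * len(b), so for large inputs the broadcasting form or a spatial index is likely the better fit.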
    

    【Comments】:
