【Question Title】:Efficient mapping of a function with multiple arguments along an axis of an ndarray
【Posted】:2020-05-05 01:20:51
【Question】:

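I have two ndarrays, one for values and another for weights (derived from the errors on values). I am interested in obtaining the weighted mean and standard deviation along axis=1 of the values ndarray. For clarity, here is a toy example that does the task.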

import math

import numpy as np

values = [[0.25, 0.34, 0.28, 0.54], [0.23, 0.38, 0.29, 0.55], [0.21, 0.36, 0.31, 0.56]]
errors = [[0.02, 0.01, 0.03, 0.01], [0.01, 0.02, 0.03, 0.01], [0.04, 0.03, 0.01, 0.02]]

def invsqerr(x):
    # Inverse-square-error weight for an error x
    return 1 / x**2

weights = np.apply_along_axis(invsqerr, 1, np.asarray(errors))


def wavg_std(y_arr, invsqerr_arr):
    # Weighted mean and weighted standard deviation of a 1-D sequence
    y_arr = np.asarray(y_arr)
    average = np.average(y_arr, weights=invsqerr_arr)
    variance = np.average((y_arr - average)**2, weights=invsqerr_arr)
    return (average, math.sqrt(variance))


# One (mean, std) pair per column of values
for k in range(len(values[0])):
    print(wavg_std([i[k] for i in values], [i[k] for i in weights]))

Output:

(0.23285714285714285, 0.009331389496316869)
(0.34897959183673471, 0.015681120581468193)
(0.30545454545454542, 0.009875254992000192)
(0.54666666666666663, 0.006666666666666672)

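In my case, len(values[0]) (the range of the for loop) is on the order of a few million. A for loop does not seem like the right approach for arrays this large.

I am looking for an efficient way to do this, perhaps something based on np.apply_along_axis with multiple arguments.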

【Comments】:

    Tags: python performance apply


    【Solution 1】:

    Here is an efficient solution using Python 3's multiprocessing module.

    First, I transposed the ndarrays values and weights:

    values=np.stack(values).transpose()
    weights=np.stack(weights).transpose()
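
To illustrate why the transpose matters, here is a minimal sketch using the toy data from the question. Iterating over a 2-D array yields its rows, so after transposing, each pair produced by zip holds one column of the original values together with the matching column of weights. The names values_t, weights_t and pairs are only illustrative and do not appear in the original answer.

```python
import numpy as np

values = np.array([[0.25, 0.34, 0.28, 0.54],
                   [0.23, 0.38, 0.29, 0.55],
                   [0.21, 0.36, 0.31, 0.56]])
errors = np.array([[0.02, 0.01, 0.03, 0.01],
                   [0.01, 0.02, 0.03, 0.01],
                   [0.04, 0.03, 0.01, 0.02]])
weights = 1.0 / errors**2

# Transposing turns the (3, 4) arrays into (4, 3) arrays,
# so each row now corresponds to one column of the original data.
values_t = values.T
weights_t = weights.T

# zip pairs the k-th row of values_t with the k-th row of weights_t.
pairs = list(zip(values_t, weights_t))
first_values, first_weights = pairs[0]
print(values_t.shape, len(pairs))
print(first_values)
```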
    

    Then use starmap:

    import multiprocessing

    if __name__ == '__main__':
        with multiprocessing.Pool() as pool:
            # One task per (values row, weights row) pair after the transpose
            results = pool.starmap(wavg_std, zip(values, weights))
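
As a possible alternative to process-based parallelism, the same quantities can likely be computed without any Python-level loop, because np.average accepts an axis argument and a weights array of the same shape as the data. The sketch below assumes the untransposed layout from the question, where each column is one data set, so the reduction runs over axis=0. The helper name wavg_std_columns is hypothetical and is not part of the original answer.

```python
import numpy as np

def wavg_std_columns(values, errors):
    """Weighted mean and weighted std of every column (reduction over axis 0)."""
    values = np.asarray(values, dtype=float)
    # Inverse-square-error weights, computed elementwise for the whole array.
    weights = 1.0 / np.asarray(errors, dtype=float) ** 2
    average = np.average(values, axis=0, weights=weights)
    # average has one entry per column and broadcasts against values.
    variance = np.average((values - average) ** 2, axis=0, weights=weights)
    return average, np.sqrt(variance)

values = [[0.25, 0.34, 0.28, 0.54],
          [0.23, 0.38, 0.29, 0.55],
          [0.21, 0.36, 0.31, 0.56]]
errors = [[0.02, 0.01, 0.03, 0.01],
          [0.01, 0.02, 0.03, 0.01],
          [0.04, 0.03, 0.01, 0.02]]

avg, std = wavg_std_columns(values, errors)
for a, s in zip(avg, std):
    print(a, s)
```

On the toy data this should reproduce the four (mean, std) pairs listed in the question's output. For arrays with millions of columns, note that the intermediate arrays are the same size as the input, so memory use grows accordingly.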
    

    【Comments】:
