【问题标题】:Prepend Zero to Long Numpy Array将零添加到长 Numpy 数组
【发布时间】:2021-04-21 23:52:08
【问题描述】:

我有一个很长的一维数组,我想计算它的累积和,然后在结果数组的开头添加一个零。

import numpy as np

def padded_cumsum(x):
    cum_sum_x = np.cumsum(x)  # Creating a new array here is fine

    return np.pad(cum_sum_x, (1,0), mode='constant')  # Prepend a single zero w/ pad

x = np.array(range(2**25))
print padded_cumsum(x)

函数padded_cumsum 将被调用数十亿次,数组长度各不相同,因此我试图避免任何数组复制,因为这很昂贵。另外,我不能通过在开头/结尾使用额外的值/NaN 实例化原始数组 x 来更改它。由于无论如何都需要创建cum_sum_x,我怀疑我可以通过做一些hacky来潜入零:

def padded_cumsum(x):
    cum_sum_x = np.empty(x.shape[0]+1)
    cum_sum_x[0] = 0 
    cum_sum_x[1:] = np.cumsum(x)

    return np.pad(cum_sum_x, (1,0), mode='constant')  # Prepend a single zero w/ pad

【问题讨论】:

标签: python arrays numpy


【解决方案1】:

在手动分配的数组上使用 out 关键字。

out = np.empty(len(x)+pad, dtype=yourdtype)
np.cumsum(x, out=out[pad:])
out[:pad] = 0

【讨论】:

    【解决方案2】:

    你可以就地cumsum:

    def padcumsum(x):
        csum=np.hstack((0,x)) # 3x faster than pad.
        csum.cumsum(out=csum)
        return csum
    

    对于性能问题,您可以安装 numba :

    @numba.njit
    def perf(x):
        csum=np.empty(x.size+1,x.dtype)
        csum[0]=0
        for i in range(x.size):
            csum[i+1]=csum[i]+x[i]
        return csum
    

    比 padcumsum 快两倍

    【讨论】:

    • 但大概不会比保罗的回答快吗?
    • 大约快 2 倍。
    • 什么大小的数组?
    【解决方案3】:

    性能基准

    我比较了几个选项,主要来自herenp.concatenate(([0], x)).cumsum() 最快。

    结果

    x:问题大小,y:1000 次运行的计算时间

    代码

    import timeit
    import random
    import numpy as np
    import matplotlib.pyplot as plt
    
    cmds = [
        'np.r_[[0], x].cumsum()',
        'np.hstack(([0], x)).cumsum()',
        'np.concatenate(([0], x)).cumsum()',
        'csp0 = np.zeros(shape=(len(x) + 1,)); np.cumsum(x, out=csp0[1:])',
        ]
    test_range = [1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6]
    # test_range = [1e0, 1e1, 1e2]
    
    ts = np.empty((len(cmds), len(test_range)), dtype=float)
    for tt, size_float in enumerate(test_range):
        size = round(size_float)
        print('array size:', size)
        x = np.random.randint(low=0, high=100, size=size)
    
        n_trials = 1000
        for cc, cmd in enumerate(cmds):
    
            t = timeit.Timer(cmd, globals={**globals(), **locals()})
            t = t.timeit(n_trials)
            ts[cc, tt] = t
            print('time for {:d}x \"{:}\": {:.6f}'.format(n_trials, cmd, t))
    
    
    fig, ax = plt.subplots(1, 1, figsize=(15, 10))
    for cc, cmd in enumerate(cmds):
        ax.plot(test_range, ts[cc, :], label=cmd)
        print(cmd)
    ax.legend()
    ax.set_xscale('log')
    ax.set_yscale('log')
    

    输出

    array size: 1
    time for 1000x "np.r_[[0], x].cumsum()": 0.015609
    time for 1000x "np.hstack(([0], x)).cumsum()": 0.005469
    time for 1000x "np.concatenate(([0], x)).cumsum()": 0.002997
    time for 1000x "csp0 = np.zeros(shape=(len(x) + 1,)); np.cumsum(x, out=csp0[1:])": 0.003499
    array size: 10
    time for 1000x "np.r_[[0], x].cumsum()": 0.018170
    time for 1000x "np.hstack(([0], x)).cumsum()": 0.005663
    time for 1000x "np.concatenate(([0], x)).cumsum()": 0.002993
    time for 1000x "csp0 = np.zeros(shape=(len(x) + 1,)); np.cumsum(x, out=csp0[1:])": 0.003511
    array size: 100
    time for 1000x "np.r_[[0], x].cumsum()": 0.018444
    time for 1000x "np.hstack(([0], x)).cumsum()": 0.005621
    time for 1000x "np.concatenate(([0], x)).cumsum()": 0.003145
    time for 1000x "csp0 = np.zeros(shape=(len(x) + 1,)); np.cumsum(x, out=csp0[1:])": 0.003770
    array size: 1000
    time for 1000x "np.r_[[0], x].cumsum()": 0.018034
    time for 1000x "np.hstack(([0], x)).cumsum()": 0.007816
    time for 1000x "np.concatenate(([0], x)).cumsum()": 0.005275
    time for 1000x "csp0 = np.zeros(shape=(len(x) + 1,)); np.cumsum(x, out=csp0[1:])": 0.006885
    array size: 10000
    time for 1000x "np.r_[[0], x].cumsum()": 0.036433
    time for 1000x "np.hstack(([0], x)).cumsum()": 0.027001
    time for 1000x "np.concatenate(([0], x)).cumsum()": 0.024336
    time for 1000x "csp0 = np.zeros(shape=(len(x) + 1,)); np.cumsum(x, out=csp0[1:])": 0.034565
    array size: 100000
    time for 1000x "np.r_[[0], x].cumsum()": 0.228152
    time for 1000x "np.hstack(([0], x)).cumsum()": 0.219081
    time for 1000x "np.concatenate(([0], x)).cumsum()": 0.215639
    time for 1000x "csp0 = np.zeros(shape=(len(x) + 1,)); np.cumsum(x, out=csp0[1:])": 0.311723
    array size: 1000000
    time for 1000x "np.r_[[0], x].cumsum()": 2.693319
    time for 1000x "np.hstack(([0], x)).cumsum()": 2.656931
    time for 1000x "np.concatenate(([0], x)).cumsum()": 2.634273
    time for 1000x "csp0 = np.zeros(shape=(len(x) + 1,)); np.cumsum(x, out=csp0[1:])": 3.755322
    np.r_[[0], x].cumsum()
    np.hstack(([0], x)).cumsum()
    np.concatenate(([0], x)).cumsum()
    csp0 = np.zeros(shape=(len(x) + 1,)); np.cumsum(x, out=csp0[1:])
    

    【讨论】:

      猜你喜欢
      • 2019-08-17
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 2011-04-22
      • 2011-02-10
      • 2014-03-05
      • 2017-12-27
      • 2012-08-26
      相关资源
      最近更新 更多