【Question title】: How can numpy.ufunc.reduceat indices be generated from a Python slice object?
【Posted】: 2017-01-11 16:22:08
【Question】:

Say I have a slice like x[p:-q:n] or x[::n]. I want to use it to generate the indices to pass to numpy.ufunc.reduceat(x, [p, p + n, p + 2 * n, ...]) or numpy.ufunc.reduceat(x, [0, n, 2 * n, ...]). What is the simplest efficient way to do this?

【Comments】:

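  • Why not just use range? list(range(0, len(x), n))
  • Is that the most efficient way?
  • If what you want is a list of indices, it works well.
  • This runs inside a very large loop. One thing that worries me is the cost of creating the range and list objects. But thanks, I'll use it if I don't find a more optimized way.
  • Try np.arange or np.r_.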

Tags: python numpy range slice numpy-ufunc


【Solution 1】:

Building on the comments:

In [351]: x=np.arange(100)
In [352]: np.r_[0:100:10]
Out[352]: array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])
In [353]: np.add.reduceat(x,np.r_[0:100:10])
Out[353]: array([ 45, 145, 245, 345, 445, 545, 645, 745, 845, 945], dtype=int32)
In [354]: np.add.reduceat(x,np.arange(0,100,10))
Out[354]: array([ 45, 145, 245, 345, 445, 545, 645, 745, 845, 945], dtype=int32)
In [355]: np.add.reduceat(x,list(range(0,100,10)))
Out[355]: array([ 45, 145, 245, 345, 445, 545, 645, 745, 845, 945], dtype=int32)
In [356]: x.reshape(-1,10).sum(axis=1)
Out[356]: array([ 45, 145, 245, 345, 445, 545, 645, 745, 845, 945])
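
The session above hard-codes the bounds 0, 100 and 10. Starting from an actual slice object, the built-in slice.indices(length) method resolves None and negative bounds against the array length, and the result can be fed to np.arange. The following is a minimal sketch, not the answerer's code; the helper name reduceat_indices is hypothetical. Note that reduceat extends the last segment to the end of the array, so honoring a stop such as -q requires trimming x first.

```python
import numpy as np

def reduceat_indices(s, length):
    """Hypothetical helper: turn a slice into start offsets for ufunc.reduceat."""
    # slice.indices resolves None and negative bounds against `length`
    start, stop, step = s.indices(length)
    if step <= 0:
        raise ValueError("this sketch only handles a positive step")
    return np.arange(start, stop, step)

x = np.arange(100)

# x[::10] -> offsets 0, 10, ..., 90
idx = reduceat_indices(slice(None, None, 10), len(x))
sums = np.add.reduceat(x, idx)

# x[5:-5:10] -> offsets 5, 15, ..., 85. reduceat runs the last segment to the
# end of its input, so trim x at the resolved stop to exclude the tail.
start, stop, step = slice(5, -5, 10).indices(len(x))
trimmed = np.add.reduceat(x[:stop], np.arange(start, stop, step))
```

Without the x[:stop] trimming, the final segment would also include the elements from stop to the end of x.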

And timings:

In [357]: timeit np.add.reduceat(x,np.r_[0:100:10])
The slowest run took 9.30 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 31.2 µs per loop
In [358]: timeit np.add.reduceat(x,np.arange(0,100,10))
The slowest run took 85.75 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.69 µs per loop
In [359]: timeit np.add.reduceat(x,list(range(0,100,10)))
The slowest run took 4.31 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 11.9 µs per loop
In [360]: timeit x.reshape(-1,10).sum(axis=1)
The slowest run took 5.57 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 11.5 µs per loop

reduceat with arange looks best, but this should be tested on more realistic data. At this size the differences in speed are small.
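
To repeat the comparison outside IPython and on a larger input, something like the following sketch could be used. The array size, the chunk length n and the repeat counts are arbitrary assumptions, and the numbers it prints will depend on the machine and NumPy version.

```python
import timeit
import numpy as np

x = np.arange(10_000)  # assumed test size
n = 10                 # assumed chunk length; must divide len(x) for reshape

candidates = {
    "np.r_": lambda: np.add.reduceat(x, np.r_[0:len(x):n]),
    "np.arange": lambda: np.add.reduceat(x, np.arange(0, len(x), n)),
    "list(range)": lambda: np.add.reduceat(x, list(range(0, len(x), n))),
    "reshape+sum": lambda: x.reshape(-1, n).sum(axis=1),
}

# Check that all four approaches agree before timing them
results = {name: fn() for name, fn in candidates.items()}

# Best of 3 runs of 200 calls each, reported per call
calls = 200
timings = {
    name: min(timeit.repeat(fn, number=calls, repeat=3)) / calls
    for name, fn in candidates.items()
}
for name, seconds in timings.items():
    print(f"{name:12s} {seconds * 1e6:8.1f} us per call")
```

The reshape variant only applies when the chunks are contiguous, start at 0 and evenly divide the array, so it is not a general replacement for reduceat.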

The value of r_ is that it lets you use convenient slice notation; it lives in a file called index_tricks.py.
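
As a small illustration of that slice notation, the sketch below shows np.r_ producing the same offsets as np.arange, including when it is indexed with a slice object. One caveat worth keeping in mind: np.r_ has no knowledge of len(x), so a negative stop is taken literally rather than counted from the end of the array.

```python
import numpy as np

x = np.arange(100)

a = np.r_[0:100:10]            # slice notation inside the brackets
b = np.r_[slice(0, 100, 10)]   # indexing with a slice object gives the same result
c = np.arange(0, 100, 10)      # the explicit equivalent

# r_ does not resolve negative bounds against any array length:
# this is arange(0, -5, 10), which is empty.
empty = np.r_[0:-5:10]

# Spelling the stop out relative to len(x) gives the intended offsets.
resolved = np.r_[0:len(x) - 5:10]
```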

For an x with 10000 elements, the corresponding times are 80, 46, 238, 51.
