【Question title】: Delete a series of elements every nth time in numpy array
【Posted】: 2017-04-11 11:40:59
【Question】:

I know how to delete every 4th element in a numpy array:

frame = np.delete(frame,np.arange(4,frame.size,4))
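
A side note on indexing, offered as a sketch rather than a statement of the asker's intent: `np.arange(4, frame.size, 4)` yields the zero-based indices 4, 8, 12, ..., so the call above removes the 5th, 9th, 13th, ... elements. Starting the range at 3 removes the 4th, 8th, 12th, ... instead. The sample array `frame` below is an assumption, chosen to match the values 1 to 20.

```python
import numpy as np

# Hypothetical sample data: the values 1..20.
frame = np.arange(1, 21)

# Indices 4, 8, 12, 16 -> drops the values 5, 9, 13, 17.
from_index_4 = np.delete(frame, np.arange(4, frame.size, 4))

# Indices 3, 7, 11, 15, 19 -> drops the values 4, 8, 12, 16, 20.
every_4th = np.delete(frame, np.arange(3, frame.size, 4))

print(from_index_4)
print(every_4th)
```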

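Now I would like to know whether there is a simple command that deletes a run of 3 values at every nth step (e.g. after every 3 kept values).

A basic example:

Input: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20....]

would lead to:

Output: [1,2,3,7,8,9,13,14,15,19,20,....]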

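I am hoping for a simple numpy / python feature rather than writing a function that has to iterate over the vector (because in my case it is quite long).

Thanks for your help

【Question comments】:

    Tags: python arrays numpy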


    【Solution 1】:

    An approach using boolean indexing:

    def block_delete(a, n, m):  #keep n, remove m
        mask = np.tile(np.r_[np.ones(n), np.zeros(m)].astype(bool), a.size // (n + m) + 1)[:a.size]
        return a[mask]
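
To make the mask that `block_delete` builds easier to follow, here is a small sketch run on the question's input (the values 1 to 20, an assumed stand-in for the real data), keeping 3 and removing 3. The intermediate name `period` is introduced only for readability.

```python
import numpy as np

def block_delete(a, n, m):  # keep n, remove m
    # One period of the pattern: n True values followed by m False values.
    period = np.r_[np.ones(n), np.zeros(m)].astype(bool)
    # Repeat the period often enough to cover the array, then trim to size.
    mask = np.tile(period, a.size // (n + m) + 1)[:a.size]
    return a[mask]

a = np.arange(1, 21)
result = block_delete(a, 3, 3)
print(result)  # [ 1  2  3  7  8  9 13 14 15 19 20]
```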
    

    Comparing with @Divakar's approach:

    def mod_delete(a, n, m):
        return a[np.mod(np.arange(a.size), n + m) < n]
    
    a = np.arange(19) + 1
    
    %timeit block_delete(a, 3, 4)
    10000 loops, best of 3: 50.6 µs per loop
    
    %timeit mod_delete(a, 3, 4)
    The slowest run took 9.37 times longer than the fastest. This could mean that an intermediate result is being cached.
    100000 loops, best of 3: 5.69 µs per loop
    

    Let's try a longer array:

    a = np.arange(999) + 1
    
    %timeit block_delete(a, 3, 4)
    The slowest run took 4.61 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000 loops, best of 3: 54.8 µs per loop
    
    %timeit mod_delete(a, 3, 4)
    The slowest run took 5.13 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000 loops, best of 3: 14.5 µs per loop
    

    And an even longer one:

    a = np.arange(999999) + 1
    
    %timeit block_delete(a, 3, 4)
    100 loops, best of 3: 3.93 ms per loop
    
    %timeit mod_delete(a, 3, 4)
    100 loops, best of 3: 12.3 ms per loop
    

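    So which one is faster depends on the size of your array: in these runs `mod_delete` wins on the short arrays, while `block_delete` wins on the array with about a million elements.

    【Comments】:

      【Solution 2】: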
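
The IPython `%timeit` runs above can be approximated in a plain script with the standard-library `timeit` module. This is only a sketch: the array sizes mirror the ones used above, the repeat counts (200 and 20) are arbitrary assumptions, and absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def block_delete(a, n, m):  # keep n, remove m
    mask = np.tile(np.r_[np.ones(n), np.zeros(m)].astype(bool),
                   a.size // (n + m) + 1)[:a.size]
    return a[mask]

def mod_delete(a, n, m):
    return a[np.mod(np.arange(a.size), n + m) < n]

for size in (19, 999, 999999):
    a = np.arange(size) + 1
    # Both functions must agree before their timings are worth comparing.
    assert np.array_equal(block_delete(a, 3, 4), mod_delete(a, 3, 4))
    for func in (block_delete, mod_delete):
        runs = 200 if size < 100000 else 20
        seconds = timeit.timeit(lambda: func(a, 3, 4), number=runs) / runs
        print(f"{func.__name__:>12} size={size:>6}: {seconds * 1e6:.1f} µs per call")
```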

      Approach #1: Here's one approach with modulus and boolean-indexing -

      a[np.mod(np.arange(a.size),6)<3]
      

      As a function, it would translate to:

      def select_in_groups(a, M, N): # Keep first M, delete next N and so on.
          return a[np.mod(np.arange(a.size),M+N)<M]
      

      Sample step-by-step run -

      # Input array
      In [361]: a
      Out[361]: 
      array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
             18, 19, 20])
      
      # Create a range array that spans along the length of array
      In [362]: np.arange(a.size)
      Out[362]: 
      array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
             17, 18, 19])
      
      # Use modulus to create "intervaled" version of it that shifts at
      # the end of each group of 6 elements
      In [363]: np.mod(np.arange(a.size),6)
      Out[363]: array([0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1])
      
      # We need to select the first three as valid ones, so compare against 3
      # creating a boolean array or mask
      In [364]: np.mod(np.arange(a.size),6) < 3
      Out[364]: 
      array([ True,  True,  True, False, False, False,  True,  True,  True,
             False, False, False,  True,  True,  True, False, False, False,
              True,  True], dtype=bool)
      
      # Use the mask to select valid elements off array
      In [365]: a[np.mod(np.arange(a.size),6)<3]
      Out[365]: array([ 1,  2,  3,  7,  8,  9, 13, 14, 15, 19, 20])
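
The constants in the one-liner follow directly from the pattern: 6 is the period (3 kept plus 3 removed), and 3 is how many positions at the start of each period are kept. A small sketch with other values, where the array and the (M, N) pairs are illustrative assumptions:

```python
import numpy as np

def select_in_groups(a, M, N):  # Keep first M, delete next N and so on.
    return a[np.mod(np.arange(a.size), M + N) < M]

a = np.arange(1, 21)

print(select_in_groups(a, 3, 3))  # period 6, keep 3: the question's pattern
print(select_in_groups(a, 2, 1))  # period 3, keep 2: drops every 3rd value
print(select_in_groups(a, 1, 4))  # period 5, keep 1: keeps 1, 6, 11, 16
```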
      

      Approach #2: For performance, here's another approach with NumPy array strides -

      def select_in_groups_strided(a, M, N): # Keep first M, delete next N and so on.
          K = M+N
          na = a.size
          nrows = (1+((na-1)//K))
          n = a.strides[0]
          out = np.lib.stride_tricks.as_strided(a, shape=(nrows,K), strides=(K*n,n))
          N = M*(na//K) + (na - (K*(na//K)))
          return out[:,:M].ravel()[:N]
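
One caveat with `as_strided`: when `a.size` is not a multiple of `M+N`, the last row of the strided view extends past the end of the array's buffer. As an alternative sketch (not the answer's method), the complete groups can be handled with an ordinary `reshape` and the leftover tail sliced separately. The function name `select_in_groups_reshape` is made up for this illustration.

```python
import numpy as np

def select_in_groups_reshape(a, M, N):  # Keep first M, delete next N and so on.
    K = M + N
    # Number of elements covered by complete groups of K.
    full = (a.size // K) * K
    # View the complete groups as rows of length K and keep the first M columns.
    head = a[:full].reshape(-1, K)[:, :M].ravel()
    # The incomplete last group contributes at most its first M elements.
    tail = a[full:][:M]
    return np.concatenate((head, tail))

print(select_in_groups_reshape(np.arange(1, 21), 3, 3))
print(select_in_groups_reshape(np.arange(1, 25), 3, 3))
```

Unlike the strided version, this never reads outside the array, at the cost of one extra concatenation.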
      

      Sample run -

      In [545]: a = np.arange(1,21)
      
      In [546]: a
      Out[546]: 
      array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
             18, 19, 20])
      
      In [547]: select_in_groups_strided(a,3,3)
      Out[547]: array([ 1,  2,  3,  7,  8,  9, 13, 14, 15, 19, 20])
      
      In [548]: a = np.arange(1,25)
      
      In [549]: a
      Out[549]: 
      array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
             18, 19, 20, 21, 22, 23, 24])
      
      In [550]: select_in_groups_strided(a,3,3)
      Out[550]: array([ 1,  2,  3,  7,  8,  9, 13, 14, 15, 19, 20, 21])
      

      Runtime test

      Using the same setup as in @Daniel Forsman's timing tests -

      In [637]: a = np.arange(1,21)
      
      In [638]: %timeit block_delete(a,3,3)
      10000 loops, best of 3: 21 µs per loop
      
      In [639]: %timeit select_in_groups_strided(a,3,3)
      100000 loops, best of 3: 6.44 µs per loop
      
      In [640]: a = np.arange(1,2100)
      
      In [641]: %timeit block_delete(a,3,3)
      10000 loops, best of 3: 27 µs per loop
      
      In [642]: %timeit select_in_groups_strided(a,3,3)
      100000 loops, best of 3: 9.1 µs per loop
      
      In [643]: a = np.arange(999999) + 1
      
      In [644]: %timeit block_delete(a,3,3)
      100 loops, best of 3: 2.24 ms per loop
      
      In [645]: %timeit select_in_groups_strided(a,3,3)
      1000 loops, best of 3: 1.12 ms per loop
      

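      The strided approach scales well across the different array sizes, if performance is a concern.

      【Comments】:

      • Thank you, this works. Could you explain what the command does? I don't really understand what the ",6" means.
      • Yeah, I figured there would be an as_strided answer, I just couldn't wrap my head around it.
      【Solution 3】: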
      import numpy as np
      a = np.array([10, 0, 0, 20, 0, 30, 40, 50, 0, 60, 70, 80, 90, 100,0])
      print("Original array:")
      print(a)
      # Use an integer dtype: np.delete rejects float index arrays.
      index=np.zeros(0, dtype=int)
      for i in range(len(a)):
          if a[i]==0:
              index=np.append(index, i)
      print("index=",index)
      new_a=np.delete(a,index)
      print("new_a=",new_a)
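
The loop above removes zeros rather than fixed-size blocks. As a hedged sketch of how the same `np.delete` call could be used without a Python loop, the index array can be computed in one step: `np.flatnonzero` covers the zero-removal case, and a modulus test yields the indices for the keep-3/drop-3 pattern from the question. The helper name `delete_blocks` is hypothetical.

```python
import numpy as np

a = np.array([10, 0, 0, 20, 0, 30, 40, 50, 0, 60, 70, 80, 90, 100, 0])

# Integer indices of the zero entries, found without a Python loop.
zero_index = np.flatnonzero(a == 0)
new_a = np.delete(a, zero_index)
print("new_a=", new_a)

def delete_blocks(a, keep, drop):
    # Positions whose offset within each (keep + drop) period falls in the drop part.
    positions = np.arange(a.size)
    drop_index = positions[positions % (keep + drop) >= keep]
    return np.delete(a, drop_index)

print(delete_blocks(np.arange(1, 21), 3, 3))
```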
      

      【Comments】:
