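【Question title】: Running median excluding zeros
【Posted】: 2016-06-06 21:35:30
【Question】:

I borrowed some code that computes the running median of an array. For each running window, however, I want to exclude zero values. Here is the code: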

def RunningMedian(seq, M):
    seq = iter(seq)
    s = []
    m = M // 2

    # Set up list s (to be sorted) and load deque with first window of seq
    s = [item for item in islice(seq, M)]
    d = deque(s)
    # Simple lambda function to handle even/odd window sizes    
    median = lambda : s[m] if bool(M&1) else (s[m-1]+s[m]) * 0.5
    # Sort it in increasing order and extract the median ("center" of the sorted window)
    s.sort()
    # remove zeros from the array
    s = np.trim_zeros(s)
    print s
    medians = [median()]
    for item in seq:
        old = d.popleft()          # pop oldest from left
        d.append(item)             # push newest in from right
        del s[bisect_left(s, old)] # locate insertion point and then remove old 
        insort(s, item)            # insert newest such that new sort is not required        
        s = np.trim_zeros(s)
        print s
        medians.append(median())
    return medians

I am testing the code, but it fails. My example is a = np.array([5, 2, 0, 9, 4, 2, 6, 8]), and I call the function as RunningMedian(a, 3). The window I want at each step is:

[2,5]
[2,9]
[4,9]
[2,4,9]
[2,4,6]
[2,6,8]

However, after I call the function above, it prints:

[2, 5]
[2, 9]
[4, 9]
[2, 9]
[2, 6]
[2, 8]

It also returns the wrong medians.
The medians returned by the call are: [5, 9, 9, 9, 6, 8]

Can anyone help me fix this? Thanks.

【Comments on the question】:

    Tags: python numpy median


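    【Answer 1】:

    The main problem with your code is that discarding zeros from s changes the length of the list the algorithm relies on, which explains why you do not end up with length-3 windows at the end.

    I suggest another approach: use a proper function for median and ignore the zero values locally, inside that function. This is cleaner, and you no longer need trim_zeros (importing numpy just for that is overkill). Based on your function, this is what I came up with: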
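To make the length problem concrete, here is a minimal sketch of a single step of the original loop, using the state the question's own printout shows just before the fourth window. The starting values are taken from that printout; the trace itself is my reconstruction, not part of the original answer.

```python
from bisect import bisect_left, insort

# Sorted window after the zero has already been trimmed away by trim_zeros.
s = [4, 9]

# The deque still remembers the zero, so it is the value that leaves next.
old = 0

# bisect_left reports where 0 *would* be inserted, not where it is.
idx = bisect_left(s, old)   # 0

# This deletes 4, a value that is still inside the window.
del s[idx]

# The incoming value 2 is inserted as usual.
insort(s, 2)

print(s)  # [2, 9] instead of the expected [2, 4, 9]
```

Because the deque and the sorted list no longer hold the same values, every later deletion can remove the wrong element.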

    from itertools import islice
    from collections import deque
    from bisect import bisect_left,insort
    
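    def median(s):
        # Ignore zeros locally; s itself keeps its full length M
        sp = [nz for nz in s if nz != 0]
        print(sp)
        Mnow = len(sp)
        if Mnow == 0:
            return None  # assumption: an all-zero window has no median
        mnow = Mnow // 2
        # Parity comes from the filtered length, not from M
        return sp[mnow] if Mnow & 1 else (sp[mnow-1] + sp[mnow]) * 0.5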
    
    def RunningMedian(seq, M):
        seq = iter(seq)
        s = []
        m = M // 2
    
        # Set up list s (to be sorted) and load deque with first window of seq
        s = [item for item in islice(seq, M)]
        d = deque(s)
        ## Simple lambda function to handle even/odd window sizes    
        #median = lambda: s[m] if bool(M&1) else (s[m-1]+s[m])*0.5
    
        # Sort it in increasing order and extract the median ("center" of the sorted window)
        s.sort()
        medians = [median(s)]
        for item in seq:
            old = d.popleft()          # pop oldest from left
            d.append(item)             # push newest in from right
            del s[bisect_left(s, old)] # locate insertion point and then remove old 
            insort(s, item)            # insert newest such that new sort is not required        
            medians.append(median(s))
        return medians
    

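    Most of the changes are in the new median function, and I moved the print call there. I also added your imports. Note that I would approach this problem quite differently, and there is a good chance the current "fixed" version smells of duct tape.

    Either way, it seems to work the way you want: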

    >>> a = [5, 2, 0, 9, 4, 2, 6, 8]
    
    >>> RunningMedian(a,3)
    [2, 5]
    [2, 9]
    [4, 9]
    [2, 4, 9]
    [2, 4, 6]
    [2, 6, 8]
    [3.5, 5.5, 6.5, 4, 4, 6]
    

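    The reason the medians were off in your version is that the parity of the window was determined from the input window width M. If you discard zeros, you end up with smaller (here even-length) windows. In that case you do not want the middle (= second) element; you need the average of the two middle elements. Hence your wrong output.

    【Comments】: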
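The parity point can be checked in isolation. The sketch below compares the two ways of picking the median on the first zero-free window, [2, 5]; both helper names are hypothetical and exist only for this illustration.

```python
def fixed_parity_median(sorted_window, M):
    """Median as the question's lambda computes it: parity taken from M."""
    m = M // 2
    if M & 1:
        return sorted_window[m]
    return (sorted_window[m - 1] + sorted_window[m]) * 0.5


def actual_parity_median(sorted_window):
    """Median with parity taken from the window's real (filtered) length."""
    n = len(sorted_window)
    mid = n // 2
    if n & 1:
        return sorted_window[mid]
    return (sorted_window[mid - 1] + sorted_window[mid]) * 0.5


window = [2, 5]  # [5, 2, 0] sorted, with the zero dropped

print(fixed_parity_median(window, 3))  # 5
print(actual_parity_median(window))    # 3.5
```

The first result matches the leading 5 in the question's wrong output, and the second matches the leading 3.5 in the corrected output.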

      【Answer 2】:

      Try:

      [s[s!=0] for s in np.dstack((a[:-2], a[1:-1], a[2:]))[0]]
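The one-liner above yields the zero-free windows but stops short of the medians. As a rough sketch of the remaining step, the following builds the same windows and takes the median of each. It uses plain Python and `statistics.median` rather than NumPy, which is my substitution, and the function name and the choice to return `None` for an all-zero window are assumptions for illustration.

```python
from statistics import median


def zero_free_running_median(seq, M):
    """Median of each length-M window of seq, ignoring zeros in the window."""
    values = list(seq)
    medians = []
    for start in range(len(values) - M + 1):
        # Drop zeros from this window only; the window itself still spans M items.
        window = [v for v in values[start:start + M] if v != 0]
        # Assumption: a window made up entirely of zeros has no median.
        medians.append(median(window) if window else None)
    return medians


print(zero_free_running_median([5, 2, 0, 9, 4, 2, 6, 8], 3))
# [3.5, 5.5, 6.5, 4, 4, 6]
```

This recomputes each window from scratch, so it costs more per step than the bisect-based version in Answer 1, but for small windows it is easier to verify.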
      

      【Comments】:
