【Question title】: Mean absolute deviation from numpy ndarray
【Posted】: 2020-09-21 11:26:00
【Question】:

I work with a 4D numpy array and compute the statistics mean, median, and std along its third dimension, as follows:

import numpy as np
input_shape = (1, 10, 4)
n_sample = 20
X = np.random.uniform(0, 1, (n_sample,) + input_shape)
X.shape
(20, 1, 10, 4)

I then compute the mean, median, and standard deviation this way:

sta_fuc = (np.mean, np.median, np.std)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

so that:

stat.shape
(20, 1, 3, 4)

which holds the mean, median, and std values along that dimension.

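I would then like to add the mean absolute deviation (mad) of the columns, so that the statistics are (mean, median, std, mad), but numpy does not seem to provide a function for this. How can I add mad to my statistics?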

EDIT

Regarding the first answer, using the function defined there, i.e.:

def mad(arr, axis=None, keepdims=True):
    median = np.median(arr, axis=axis, keepdims=True)
    mad = np.median(np.abs(arr-median, axis=axis, keepdims=keepdims),
                    axis=axis, keepdims=keepdims)
    return mad

and then adding mad to the statistics produces an error, as shown below:

sta_fuc = (np.mean, np.median, np.std, mad)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-22-dab51665f952> in <module>()
      1 sta_fuc = (np.mean, np.median, np.std, mad)
----> 2 stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

1 frames

<ipython-input-21-84d735c8c516> in mad(arr, axis, keepdims)
      1 def mad(arr, axis=None, keepdims=True):
      2     median = np.median(arr, axis=axis, keepdims=True)
----> 3     mad = np.median(np.abs(arr-median, axis=axis, keepdims=keepdims),
      4                     axis=axis, keepdims=keepdims)
      5     return mad

TypeError: 'axis' is an invalid keyword to ufunc 'absolute'

EDIT-2

Using the scipy function suggested by @Jussi also produces an error, as follows:

from scipy.stats import median_absolute_deviation as mad
sta_fuc = (np.mean, np.median, np.std, mad)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

TypeError: median_absolute_deviation() got an unexpected keyword argument 'keepdims'

【Comments】:

    Tags: python numpy multidimensional-array numpy-ndarray


    【Answer 1】:

    I am not aware of a built-in solution in numpy, but you can implement it fairly easily on top of numpy functions using mad = median(abs(a - median(a))).

    def mad(arr, axis=None, keepdims=True):
        median = np.median(arr, axis=axis, keepdims=True)
        mad = np.median(np.abs(arr-median),axis=axis, keepdims=keepdims)
        return mad
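The version quoted in the question edit failed because `np.abs` is an elementwise ufunc and takes no `axis` or `keepdims` argument; only the reduction (`np.median`) does. A minimal check of the corrected function on an array shaped like the one in the question, reusing the question's names (`X`, `sta_fuc`, `stat`); the seeded generator is an assumption added only for reproducibility:

```python
import numpy as np

def mad(arr, axis=None, keepdims=True):
    # Median along `axis`, kept as a size-1 axis so it broadcasts against `arr`.
    median = np.median(arr, axis=axis, keepdims=True)
    # np.abs is applied elementwise; only np.median receives axis/keepdims.
    return np.median(np.abs(arr - median), axis=axis, keepdims=keepdims)

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
X = rng.uniform(0, 1, (20, 1, 10, 4))

sta_fuc = (np.mean, np.median, np.std, mad)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)
print(stat.shape)  # (20, 1, 4, 4)
```

The fourth slice along axis 2 of `stat` then holds the median absolute deviation next to the mean, median, and std.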
    

    【Comments】:

    • Thanks, but using this function raises the error 'axis' is an invalid keyword to ufunc 'absolute', as shown in the question edit.
    • It should be fixed now.
    【Answer 2】:

    Usually, I see MAD used to mean the median absolute deviation. If that is what you want, it is available in the SciPy library as scipy.stats.median_absolute_deviation().
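The SciPy function rejects `keepdims`, which is what the second error in the question reports. One way around that, sketched here with a hypothetical helper `with_keepdims`, is to wrap any reducer that accepts `axis` and re-insert the reduced axis afterwards. The stand-in reducer `median_no_keepdims` below is also hypothetical and uses only numpy; passing the SciPy function instead is an untested assumption (note that later SciPy releases renamed it to `scipy.stats.median_abs_deviation`).

```python
import numpy as np

def with_keepdims(reducer):
    """Wrap a reducer that accepts `axis` but not `keepdims` (hypothetical helper)."""
    def wrapped(arr, axis=None, keepdims=False):
        out = reducer(arr, axis=axis)
        if keepdims:
            if axis is None:
                # Full reduction: give the scalar one size-1 axis per input axis.
                out = np.reshape(out, (1,) * np.ndim(arr))
            else:
                # Re-insert the reduced axis (or axes) with length 1.
                out = np.expand_dims(out, axis)
        return out
    return wrapped

def median_no_keepdims(arr, axis=None):
    # Stand-in for a reducer without `keepdims`, used only for this demo.
    return np.median(arr, axis=axis)

X = np.arange(24.0).reshape(2, 1, 3, 4)
out = with_keepdims(median_no_keepdims)(X, axis=2, keepdims=True)
print(out.shape)  # (2, 1, 1, 4)
```

A wrapped reducer can then sit in the same tuple as `np.mean`, `np.median`, and `np.std` and be called with `axis=2, keepdims=True`.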

    It is also easy to write a suitable function yourself.
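The question title asks for the mean absolute deviation rather than the median one. If that is what is wanted, the same pattern applies with `np.mean` in place of `np.median`. A minimal sketch, where the name `mean_abs_dev` is hypothetical and the deviation is taken around the mean (an assumption about the intended definition):

```python
import numpy as np

def mean_abs_dev(arr, axis=None, keepdims=False):
    """Mean absolute deviation around the mean (hypothetical helper name)."""
    arr = np.asarray(arr, dtype=float)
    # Keep the reduced axis so the center broadcasts against `arr`.
    center = np.mean(arr, axis=axis, keepdims=True)
    return np.mean(np.abs(arr - center), axis=axis, keepdims=keepdims)

# Mean is 2.5; absolute deviations are 1.5, 0.5, 0.5, 1.5; their mean is 1.0.
print(mean_abs_dev([1.0, 2.0, 3.0, 4.0]))  # 1.0
```

Because it accepts `axis` and `keepdims`, it can be used in the list comprehension from the question without changes.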

    Edit: here is a MAD function that takes a keepdims argument:

    def mad(data, axis=None, scale=1.4826, keepdims=False):
        """Median absolute deviation (MAD).
        
        Defined as the median absolute deviation from the median of the data. A
        robust alternative to stddev. Results should be identical to
        scipy.stats.median_absolute_deviation(), which does not take a keepdims
        argument.
    
        Parameters
        ----------
        data : array_like
            The data.
        axis : numpy axis spec, optional
            Axis or axes along which to compute MAD.
        scale : float, optional
            Scaling of the result. By default, it is scaled to give a consistent
            estimate of the standard deviation of values from a normal
            distribution.
        keepdims : bool, optional
            If this is set to True, the axes which are reduced are left in the
            result as dimensions with size one.
    
        Returns
        -------
        ndarray
            The MAD.
        """
        # keep dims here so that broadcasting works
        med = np.median(data, axis=axis, keepdims=True)
        abs_devs = np.abs(data - med)
        return scale * np.median(abs_devs, axis=axis, keepdims=keepdims)
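The default `scale=1.4826` makes the result comparable to a standard deviation for normally distributed data. A rough sanity check of that claim, repeating the function above without its docstring; the sample size, seed, and sigma are arbitrary choices for illustration:

```python
import numpy as np

def mad(data, axis=None, scale=1.4826, keepdims=False):
    # Same computation as the function above, docstring omitted for brevity.
    med = np.median(data, axis=axis, keepdims=True)
    abs_devs = np.abs(data - med)
    return scale * np.median(abs_devs, axis=axis, keepdims=keepdims)

rng = np.random.default_rng(0)            # arbitrary seed
x = rng.normal(loc=0.0, scale=2.0, size=100_000)

scaled = mad(x)                           # should land close to sigma = 2.0
raw = mad(x, scale=1.0)                   # unscaled median absolute deviation
print(scaled, raw, np.std(x))
```

Passing `scale=1.0` returns the plain median absolute deviation, which is what the formula in the first answer computes.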
    

    【Comments】:

    • @super_ask I added a mad() function that takes keepdims.