How can I get the average of each column of nested numpy arrays?

【Question title】: How can I get all of the average of values of each respective column of nested numpy arrays?
【Posted】: 2017-12-11 05:03:34
【Question description】:

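I am having trouble performing a column-wise operation on each column of a 2-D numpy array. I am trying to adapt my case to this answer, though my setup is different. My actual dataset is very large and involves multiple resamplings, hence the syntax of the example below. If the code and explanation look too long, consider skipping ahead to the heading Relevant.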

Skippable (included only to reproduce zs below)

Consider a dataset of (x_n, y_n) pairs, where n = 0, 1, or 2.

import numpy as np

def get_xy(num, size=10):
    ## (x1, y1), (x2, y2), (x3, y3) where xi, yi are both arrays
    if num == 0:
        x = np.linspace(7, size+6, size)
        y = np.linspace(3, size+2, size)
    elif num == 1:
        x = np.linspace(5, size+4, size)
        y = np.linspace(2, size+1, size)
    elif num == 2:
        x = np.linspace(4, size+3, size)
        y = np.linspace(1, size, size)
    return x, y

Suppose we can compute some metric z_n given the arrays x_n and y_n.

def get_single_z(x, y, constant=2):
    ## element-wise difference, scaled by constant; x and y are expected to have equal length
    deltas = np.asarray(x) - np.asarray(y)
    return constant * deltas

Instead of computing each z_n separately, we can compute all of the z_n at once.

def get_all_z(constant=2):
    zs = []
    for num in range(3): ## 0, 1, 2
        xs, ys = get_xy(num)
        zs.append(get_single_z(xs, ys, constant))
    zs = np.array(zs)
    return zs

Relevant:

zs = get_all_z()
print(zs)
>> [[ 8.  8.  8.  8.  8.  8.  8.  8.  8.  8.]
    [ 6.  6.  6.  6.  6.  6.  6.  6.  6.  6.]
    [ 6.  6.  6.  6.  6.  6.  6.  6.  6.  6.]]

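For my purposes, I would like to create a new list or array vs in which the value at each index equals the mean of the values in the corresponding column of zs. In this case every element of vs is identical (since each one is the mean of [8, 6, 6]). But if the first element of the first sub-array were 10 instead of 8, then the first element of vs would be the mean of [10, 6, 6].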
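To make the desired result concrete, here is a minimal sketch in plain Python. The helper name expected_vs and the shortened three-column rows are illustrative assumptions, not part of the original code; the first row starts with 10 to mirror the modified case described above.

```python
def expected_vs(rows):
    """Return the mean of each column of a list of equal-length rows."""
    n_rows = len(rows)
    # zip(*rows) yields one tuple per column
    return [sum(col) / n_rows for col in zip(*rows)]

# Shortened stand-in for zs with the first element changed from 8 to 10
rows = [[10.0, 8.0, 8.0],
        [6.0, 6.0, 6.0],
        [6.0, 6.0, 6.0]]

vs = expected_vs(rows)
print(vs)  # first entry is the mean of [10, 6, 6], the rest the mean of [8, 6, 6]
```

This loop-based version only pins down the expected values; it is not meant as an efficient approach for a large dataset.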

Unsuccessful attempt:

def get_avg_per_col(z):
    ## column ?= axis number
    return [np.mean(z, axis=i) for i in range(len(zs[0]))]

print(get_avg_per_col(zs))
Traceback (most recent call last):...
...line 50, in _count_reduce_items ## of numpy code, not my code
    items *= arr.shape[ax]
IndexError: tuple index out of range

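【Question comments】:

So, tl;dr, you want to find the mean of each row, right?
The mean of each column, not of each row.
So you would have 10 means? Not 3?
Exactly, my goal is to create a new list or array containing 10 means (one mean per column).
Got it. My answer should solve this. Let me know if there are any problems.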

Tags: arrays python-3.x numpy multidimensional-array error-handling

【Solution 1】:

You can use np.mean on the transposed zs to get the column means.

In [49]: import numpy as np

In [53]: zs = np.array([[ 8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.],
    ...:  [ 6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.],
    ...:  [ 6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.]])

In [54]: np.mean(zs.T, axis=1)
Out[54]: 
array([ 6.66666667,  6.66666667,  6.66666667,  6.66666667,  6.66666667,
        6.66666667,  6.66666667,  6.66666667,  6.66666667,  6.66666667])
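As a hedged follow-up sketch: the transpose can also be skipped by reducing along axis 0 directly, which should give the same result. The example below rebuilds the same zs, and the names col_means_t, col_means, and zs_mod are illustrative. It also suggests why the attempt in the question fails: a 2-D array only has axes 0 and 1, so looping axis=i up to 9 asks for axes that do not exist.

```python
import numpy as np

# Same 3 x 10 array as in the question
zs = np.array([[8.0] * 10, [6.0] * 10, [6.0] * 10])

col_means_t = np.mean(zs.T, axis=1)  # transpose, then reduce along axis 1
col_means = np.mean(zs, axis=0)      # reduce along axis 0 directly

print(np.allclose(col_means, col_means_t))  # True
print(col_means.shape)                      # (10,)

# The modified case from the question: first element 10 instead of 8
zs_mod = zs.copy()
zs_mod[0, 0] = 10.0
print(np.mean(zs_mod, axis=0)[0])  # mean of [10, 6, 6]

# A 2-D array has no axis 2, so this raises instead of returning a value
try:
    np.mean(zs, axis=2)
except (IndexError, ValueError) as err:
    print("invalid axis:", err)
```

The axis argument names the dimension that is collapsed, so axis=0 collapses the 3 rows and leaves one mean per column.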

【Comments】:

Very nice, and as quick as ever! Have a plus one.