Calculating statistics across slices of a 3D NumPy array

[Question title]: Calculating statistics across slices of 3D NumPy array
[Posted]: 2022-11-05 19:04:10
[Question]:

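Apologies if this has already been answered, but I could not find a good solution.

I have a large 3D numpy array of size (1e5, 1e3, 1e3), and I need to compute a SciPy statistic (Weibull parameters) for each slice along the first dimension, that is, for each 1D slice a[:, y, z]. A nested for loop gets the job done but is obviously not ideal. I looked at NumPy's apply_along_axis and apply_over_axes functions, but they did not speed things up.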

Example code

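import numpy as np

# Dimensions as described above; shape entries must be integers, not floats
a = np.random.random((int(1e5), int(1e3), int(1e3)))
stat = np.empty((int(1e3), int(1e3)))

for y in range(a.shape[1]):
    for z in range(a.shape[2]):
        # calculate_statistic is a placeholder for the SciPy Weibull fit
        stat[y, z] = calculate_statistic(a[:, y, z])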

Many thanks!

[Question comments]:

Tags: numpy-ndarray

[Solution 1]:

In theory this should work:

import numpy as np

a = np.ones((1000, 100, 100))  # fill with your input

def calculate_statistic(array):  # accepts a 1D slice or the whole 3D array
    return np.sum(array, axis=0)  # example function: reduce along the first axis

stat = calculate_statistic(a)  # one call, no explicit loop over y and z

print(stat.shape)  # prints (100, 100)
print(stat)  # prints an array holding calculate_statistic(a[:, y, z]) for each y, z
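
To check that a single whole-array call really matches the nested loop from the question, something like the following could be run on a small array. This is only a sketch: np.sum stands in for the real statistic, and the shape (20, 3, 4) is an arbitrary test size, not taken from the question.

```python
import numpy as np

def calculate_statistic(array):
    # Stand-in statistic (assumption): any reduction that accepts axis=0
    return np.sum(array, axis=0)

rng = np.random.default_rng(0)
a = rng.random((20, 3, 4))  # small test array

# One call on the whole array, reducing along the first axis
stat_fast = calculate_statistic(a)

# Reference result from the explicit nested loop
stat_loop = np.empty(a.shape[1:])
for y in range(a.shape[1]):
    for z in range(a.shape[2]):
        stat_loop[y, z] = calculate_statistic(a[:, y, z])

print(np.allclose(stat_fast, stat_loop))
```

This only works when the statistic can be written with array operations that take an axis argument. An iterative fit per slice does not fall into that category without extra work.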

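However, at least on my machine, I cannot allocate the 745 GiB of memory needed for your huge array a = np.random.random((int(1e5), int(1e3), int(1e3))). Maybe I have misunderstood your question?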
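
For reference, the 745 GiB figure follows directly from the shape, assuming NumPy's default float64 dtype (8 bytes per element). A minimal sketch of the arithmetic, without allocating anything:

```python
import math
import numpy as np

shape = (int(1e5), int(1e3), int(1e3))

# Number of elements times bytes per element (float64 is assumed here)
n_bytes = math.prod(shape) * np.dtype(np.float64).itemsize
gib = n_bytes / 2**30

print(f"{gib:.1f} GiB")  # prints 745.1 GiB
```

With float32 the requirement would be half of that, which is still well beyond typical RAM.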

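apply_over_axes is of no use in this case, because it performs the same looped computation you suggested. If the suggestion above does not work, a better solution would be to go through @np.vectorize.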
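
A sketch of what the np.vectorize route could look like, assuming a function that maps a 1D array to a scalar. Here np.median is a stand-in for the real statistic and per_column is a hypothetical name. The signature "(n)->()" treats the last axis as the core dimension, so the first axis has to be moved to the end. Note that the NumPy documentation describes np.vectorize as a convenience whose implementation is essentially a for loop, so this tidies the code more than it speeds it up.

```python
import numpy as np

def calculate_statistic(array1d):
    # Stand-in statistic (assumption): takes a 1D array, returns a scalar
    return np.median(array1d)

rng = np.random.default_rng(0)
a = rng.random((50, 4, 5))  # small test array

# "(n)->()": consume a 1D core dimension, produce a scalar per slice
per_column = np.vectorize(calculate_statistic, signature="(n)->()")

# Move the first axis to the end so each a[:, y, z] becomes a core vector
stat = per_column(np.moveaxis(a, 0, -1))

print(stat.shape)  # prints (4, 5)
```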

[Comments]: