How to add error bars to histograms with weights using matplotlib?

【Question title】: How to add error bars to histograms with weights using matplotlib?
【Posted】: 2021-03-29 11:10:56
【Question】:

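I created a histogram of my experimental data with matplotlib. The data consists of measured values and weights. Weighting the events with the weights argument of plt.hist works fine, but when I look at the options for error bars, none of them seem to take the event weights into account. The solutions I have found either use Poisson errors or apply the same error everywhere, such as this one, and that does not solve my problem.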

Mathematically, the error of a bin should be computed as err(bin) = sqrt( sum {w_i^2} ), where w_i are the individual weights of the events that fall into that bin.
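
Written out in LaTeX, and assuming the events are statistically independent, the formula reads as below. The check on the right is only a sanity test: with every weight equal to 1, the expression reduces to the usual Poisson error for a bin holding N_bin events. The symbols \sigma_{\text{bin}} and N_{\text{bin}} are notation introduced here for illustration.

```latex
\[
\sigma_{\text{bin}} = \sqrt{\sum_{i \in \text{bin}} w_i^{2}},
\qquad
w_i = 1 \ \text{for all } i \;\Rightarrow\; \sigma_{\text{bin}} = \sqrt{N_{\text{bin}}}
\]
```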

A simplified example of my histogram is given below.

import matplotlib.pyplot as plt

data=[1,8,5,4,1,10,8,3,6,7]
weights=[1.3,0.2,0.01,0.9,0.4,1.05,0.6,0.6,0.8,1.8]

plt.hist(data, bins = [0.0,2.5,5.0,7.5,10.0], weights=weights) 
plt.show()

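【Comments】:

Do you want to plot the weighted standard deviation of each bin as the error?
No, the weights are not a stddev but an importance: a higher weight means the event matters more for the final result.
Yes, I meant using the weights to calculate the standard deviation. Something like sum [(bin_center-x)**2 * weight for x in data_in_bin] / total_bin_weights? (Formatting math in comments is ugly, but I hope you get the idea.)
OK, the edit simplifies things (the weighted standard deviation is rather tricky).
I think you are right, I could do that. I have added the math for weighting by importance without going through a stddev.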

Tags: python matplotlib

【Solution 1】:

You have to compute the error of each bin manually and plot it separately.

import matplotlib.pyplot as plt  # type: ignore
import numpy as np  # type: ignore

data = np.array([1, 8, 5, 4, 1, 10, 8, 3, 6, 7])
weights = np.array([1.3, 0.2, 0.01, 0.9, 0.4, 1.05, 0.6, 0.6, 0.8, 1.8])

bin_edges = [0.0, 2.5, 5.0, 7.5, 10.0]

bin_y, _, bars = plt.hist(data, bins=bin_edges, weights=weights)
print(f"bin_y {bin_y}")
print(f"bin_edges {bin_edges}")

errors = []
bin_centers = []

for bin_index in range(len(bin_edges) - 1):

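    # find which data points are inside this bin; plt.hist treats bins as
    # half-open [left, right), except the last bin, which includes its right edge
    bin_left = bin_edges[bin_index]
    bin_right = bin_edges[bin_index + 1]
    is_last_bin = bin_index == len(bin_edges) - 2
    below_right = data <= bin_right if is_last_bin else data < bin_right
    in_bin = np.logical_and(bin_left <= data, below_right)
    print(f"in_bin {in_bin}")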

    # filter the weights to only those inside the bin
    weights_in_bin = weights[in_bin]
    print(f"weights_in_bin {weights_in_bin}")

    # compute the error however you want
    error = np.sqrt(np.sum(weights_in_bin ** 2))
    errors.append(error)
    print(f"error {error}")

    # save the center of the bins to plot the errorbar in the right place
    bin_center = (bin_right + bin_left) / 2
    bin_centers.append(bin_center)
    print(f"bin_center {bin_center}")

# plot the error bars
plt.errorbar(bin_centers, bin_y, yerr=errors, linestyle="none")

plt.show()

This produces the weighted histogram with an error bar drawn at the center of each bin.
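
The same per-bin error can also be obtained without an explicit loop. The following is a minimal sketch, not part of the original answer: it relies on np.histogram accepting a weights argument, so histogramming the squared weights gives sum {w_i^2} per bin with the same bin-edge convention that plt.hist uses. The function names weighted_hist_with_errors and plot_weighted_hist are made up for this illustration.

```python
import numpy as np


def weighted_hist_with_errors(data, weights, bin_edges):
    """Return per-bin weight sums, errors sqrt(sum w_i^2), and bin centers."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)

    # sum of weights per bin (the bar heights plt.hist would draw)
    sums, _ = np.histogram(data, bins=bin_edges, weights=weights)
    # sum of squared weights per bin, using the same binning
    sum_w2, _ = np.histogram(data, bins=bin_edges, weights=weights ** 2)

    errors = np.sqrt(sum_w2)
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return sums, errors, centers


def plot_weighted_hist(data, weights, bin_edges):
    """Draw the weighted histogram with its error bars (hypothetical helper)."""
    # imported here so the numeric part above runs without matplotlib
    import matplotlib.pyplot as plt

    sums, errors, centers = weighted_hist_with_errors(data, weights, bin_edges)
    plt.hist(data, bins=bin_edges, weights=weights)
    plt.errorbar(centers, sums, yerr=errors, linestyle="none", color="black")
    plt.show()


data = [1, 8, 5, 4, 1, 10, 8, 3, 6, 7]
weights = [1.3, 0.2, 0.01, 0.9, 0.4, 1.05, 0.6, 0.6, 0.8, 1.8]
bin_edges = [0.0, 2.5, 5.0, 7.5, 10.0]

bin_y, errors, bin_centers = weighted_hist_with_errors(data, weights, bin_edges)
print(f"bin_y {bin_y}")
print(f"errors {errors}")
```

Calling plot_weighted_hist(data, weights, bin_edges) draws the figure. Because both histograms come from np.histogram, a value that sits exactly on a bin edge (such as 5 here) is counted in the same bin for the bar height and for the error.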

By the time you added your edit, I had already made the plot with a stddev for each bin. Just replace errors with stddevs, computed as

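# goes inside the loop above; initialize stddevs = [] before the loop
# note: np.average raises ZeroDivisionError if a bin is empty (weights sum to zero)
data_in_bin = data[in_bin]
variance = np.average((data_in_bin - bin_center) ** 2, weights=weights_in_bin)
stddev = np.sqrt(variance)
print(f"stddev {stddev}")
stddevs.append(stddev)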

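However, you should check whether this stddev calculation makes sense for your use case. It results in a plot with one stddev-based error bar per bin.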
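
For reference, the stddev variant can be put together as a self-contained sketch like the one below. It is an illustration under a few assumptions rather than the exact code behind the plot: the function name weighted_stddev_per_bin is made up, empty bins are reported as NaN, and bins follow the plt.hist convention (half-open, last bin closed). Note that, as in the fragment above, the deviation is measured from the bin center rather than from the weighted mean of the bin, and it describes the spread of the data values along x, which is part of why it may not suit every use case.

```python
import numpy as np


def weighted_stddev_per_bin(data, weights, bin_edges):
    """Weighted spread of the data around each bin center, one value per bin."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)

    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    n_bins = len(centers)
    stddevs = np.full(n_bins, np.nan)  # NaN marks bins without usable data

    for i in range(n_bins):
        left, right = bin_edges[i], bin_edges[i + 1]
        if i == n_bins - 1:
            # the last bin also includes its right edge, as in plt.hist
            in_bin = (data >= left) & (data <= right)
        else:
            in_bin = (data >= left) & (data < right)

        w = weights[in_bin]
        if w.size == 0 or w.sum() == 0:
            continue  # np.average would fail here, so keep NaN

        variance = np.average((data[in_bin] - centers[i]) ** 2, weights=w)
        stddevs[i] = np.sqrt(variance)

    return stddevs, centers


data = [1, 8, 5, 4, 1, 10, 8, 3, 6, 7]
weights = [1.3, 0.2, 0.01, 0.9, 0.4, 1.05, 0.6, 0.6, 0.8, 1.8]
bin_edges = [0.0, 2.5, 5.0, 7.5, 10.0]

stddevs, bin_centers = weighted_stddev_per_bin(data, weights, bin_edges)
print(f"stddevs {stddevs}")
```

The returned stddevs array can be passed as yerr to plt.errorbar in place of errors.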

Cheers!
