【Question title】: How to plot a histogram for all unique combinations of data?
【Posted】: 2020-10-18 06:35:17
【Question description】:

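Is there a way in Python to get a size-frequency histogram of a population under different scenarios on specific days,
showing the mean with error bars?
My data is in a format similar to this table: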

SCENARIO     RUN     MEAN     DAY
A             1       25       10
A             1       15       30
A             2       20       10
A             2       27       30
B             1       45       10
B             1       50       30
B             2       43       10
B             2       35       30

results_data.groupby(['Scenario', 'Run']).mean() does not give me the days for which I want to visualize the data.
It returns the mean over all days for each run.
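One way to keep the day in the result is to include it in the grouping keys. The following is a minimal sketch, assuming the columns are named exactly as in the table above (SCENARIO, RUN, MEAN, DAY) and that the spread across runs is what the error bars should represent; `summary` is a name chosen here for illustration.

```python
import pandas as pd

# The sample table from the question
results_data = pd.DataFrame({
    'SCENARIO': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'RUN': [1, 1, 2, 2, 1, 1, 2, 2],
    'MEAN': [25, 15, 20, 27, 45, 50, 43, 35],
    'DAY': [10, 30, 10, 30, 10, 30, 10, 30],
})

# Group by scenario AND day so that each day keeps its own row;
# the runs within each group are then aggregated
summary = results_data.groupby(['SCENARIO', 'DAY'])['MEAN'].agg(['mean', 'std', 'count'])
print(summary)
```

The `mean` column can then serve as the bar height and `std` as the error bar length, for example through the `yerr` argument of `matplotlib.pyplot.bar`.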

【Comments】:

Tags: python pandas matplotlib pandas-groupby seaborn

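【Solution 1】:

Use `seaborn.FacetGrid`

FacetGrid is a multi-plot grid for plotting conditional relationships.
Map seaborn.distplot onto the FacetGrid and use hue='DAY'. Note that distplot has been deprecated since seaborn 0.11, where histplot and displot replace it.

Set up the data and dataframe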

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import random  # just for test data
import numpy as np  # just for test data


# data
random.seed(365)
np.random.seed(365)
data = {'MEAN': [np.random.randint(20, 51) for _ in range(500)],
        'SCENARIO': [random.choice(['A', 'B']) for _ in range(500)],
        'DAY': [random.choice([10, 30]) for _ in range(500)],
        'RUN': [random.choice([1, 2]) for _ in range(500)]}

# create dataframe
df = pd.DataFrame(data)

Plot with `kde=False`

g = sns.FacetGrid(df, col='RUN', row='SCENARIO', hue='DAY', height=5)
g = g.map(sns.distplot, 'MEAN', bins=range(20, 51, 5), kde=False, hist_kws=dict(edgecolor="k", linewidth=1)).add_legend()
plt.show()

Plot with `kde=True`

g = sns.FacetGrid(df, col='RUN', row='SCENARIO', hue='DAY', height=5, palette='GnBu')
g = g.map(sns.distplot, 'MEAN', bins=range(20, 51, 5), kde=True, hist_kws=dict(edgecolor="k", linewidth=1)).add_legend()
plt.show()

Plot with error bars

See "how to add error bars to histogram diagram in python".
Use df from above.
Use matplotlib.pyplot.errorbar to draw error bars on the histogram.

from itertools import product

# create unique combinations for filtering df
scenarios = df.SCENARIO.unique()
runs = df.RUN.unique()
days = df.DAY.unique()
combo_list = [scenarios, runs, days]
results = list(product(*combo_list))  

# plot
for i, result in enumerate(results, 1):  # iterate through each set of combinations
    s, r, d = result
    data = df[(df.SCENARIO == s) & (df.RUN == r) & (df.DAY == d)]  # filter dataframe
    
    # add subplot rows, columns; needs to equal the number of combinations in results
    plt.subplot(2, 4, i)
    
    # plot hist and unpack values
    n, bins, _ = plt.hist(x='MEAN', bins=range(20, 51, 5), data=data, color='g')
    
    # calculate bin centers
    bin_centers = 0.5 * (bins[:-1] + bins[1:])
    
    # draw error bars; sqrt(n) is used here, but any error estimate can be substituted
    # Poisson 1-sigma intervals would make more sense
    plt.errorbar(bin_centers, n, yerr=np.sqrt(n), fmt='k.')


    plt.title(f'Scenario: {s} | Run: {r} | Day: {d}')
plt.tight_layout()
plt.show()
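The code comment above mentions Poisson 1-sigma intervals as the better choice. One common way to compute them is the exact central interval based on chi-square quantiles. The sketch below assumes SciPy is available, which the original answer does not use, and `poisson_interval` is a hypothetical helper name.

```python
import numpy as np
from scipy.stats import chi2


def poisson_interval(counts, coverage=0.6827):
    """Exact central Poisson interval for each bin count (1 sigma by default)."""
    counts = np.asarray(counts, dtype=float)
    alpha = 1.0 - coverage
    # chi2.ppf is undefined for 0 degrees of freedom, so evaluate empty bins
    # with a placeholder count and force their lower limit to 0 afterwards
    safe = np.where(counts > 0, counts, 1.0)
    lower = np.where(counts > 0, chi2.ppf(alpha / 2, 2 * safe) / 2, 0.0)
    upper = chi2.ppf(1 - alpha / 2, 2 * (counts + 1)) / 2
    return lower, upper


counts = np.array([0, 1, 4, 9, 25])  # illustrative bin counts
lower, upper = poisson_interval(counts)
print(lower, upper)
```

In the loop above, the limits would replace the symmetric error: `plt.errorbar(bin_centers, n, yerr=[n - lower, upper - n], fmt='k.')`, since `errorbar` accepts a two-row `yerr` for asymmetric bars.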

【Comments】:

Great lesson! :) @TrentonMcKinney
