[Question title]: How to subplot multiple KDE distributions, multiple categories in one column
[Posted]: 2022-01-19 18:28:32
[Question description]:

I am trying to plot either 4 KDE graphs (as subplots) or 1 graph with 4 lines. I have two columns:

Region:      Charges:

southeast    6000
southeast    5422
southwest    3222
northwest    4222
northwest    5555
northeast    6729
etc 1000s of rows..4 regions

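I want to visualize the distribution of charges for each of these 4 regions.

I have been playing around with the code below and get the error message 'Data must be 1-dimensional' (I know the code is incorrect):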

fig, axes = plt.subplots(2, 2, sharex=True, figsize=(10,5))
fig.suptitle('Bigger 1 row x 2 columns axes with no data')
#axes[0].set_title('Title of the first chart')
reg_name = df2[['region','charges']].set_index('region')
southeast = reg_name.loc['southeast']
southwest = reg_name.loc['southwest']
northwest = reg_name.loc['northwest']

#c = df2.charges.values
#d = df2.region
# Set the dimensions of the plot
#widthInInches = 10
#heightInInches = 4
#plt.figure( figsize=(widthInInches, heightInInches) )
# Draw histograms and KDEs on the diagonal usin
#if( int(versionStrParts[1]) < 11 ):
# Use the older, now-deprectaed form
#   ax = sns.distplot(c,
#      kde_kws={"label": "Kernel Density", "color" : "black"},
#      hist_kws={"label": "Histogram", "color" : 'lightsteelblue'})
#else:
# Use the more recent for

sns.kdeplot(ax=axes[0], x=southeast.index, y=southeast.values, color="black", label="Kernel Density")
axes[0].set_title(southeast.name)

sns.kdeplot(ax=axes[1], x=southwest.index, y=southwest.values, color="black", label="Kernel Density")
axes[1].set_title(southwest.name)

[Question discussion]:

    Tags: python matplotlib seaborn


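    [Solution 1]:

    sns.kdeplot(ax=axes[0,0], data=df2[df2['region']=='southeast'], x='charges', color='k') should work for your data. Note that axes is a 2D array when both the number of rows and the number of columns are greater than 1, so it has to be indexed as axes[row, col].

    See "How to plot a mean line on a distplot between 0 and the y value of the mean?" for adding lines for the mean, sdev, etc.

    sns.displot can draw all the kdeplots in one go instead of one by one (note that displot is different from distplot):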

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    import numpy as np
    
    np.random.seed(12358)
    regions = ['southeast', 'southwest', 'northeast', 'northwest']
    df2 = pd.DataFrame({'region': np.repeat(regions, 100),
                        'charge': np.round(np.random.randn(400).cumsum() * 100 + 2000)})
    
    g = sns.displot( kind='kde', data=df2, x='charge',
                     col='region', col_order=regions, col_wrap=2,
                     height=4, aspect=3, color='black')
    for region,ax in g.axes_dict.items():
        data = df2[df2['region'] == region]['charge'].values
        xs, ys = ax.get_lines()[0].get_data()
        median = np.median(data)
        mean = data.mean()
        sdev = data.std()
        ax.vlines([mean-sdev, mean, mean+sdev], 0, np.interp([mean-sdev, mean, mean+sdev], xs, ys), color='b', ls=':')
        ax.vlines(median, 0, np.interp(median, xs, ys), color='r', ls='--')
    plt.tight_layout()
    plt.show()
    

    To plot all regions in a single plot, you can use:

    fig, ax = plt.subplots(figsize=(12, 4))
    sns.kdeplot(data=df2, x='charge', hue='region', ax=ax)
    

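    [Discussion]:

    • If this answers your question, feel free to mark the answer as accepted by clicking the check mark, which turns it from gray to green.
    • Plotting charts in Python is still a big pain. I have done it many times and I always end up copying it from somewhere. Thank you for such a detailed reply!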