【Question Title】: Making two seaborn countplots that share the same axis
【Posted】: 2019-10-14 07:21:08
【Question】:

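I want to use seaborn countplots to show the frequency distributions of two different lists of data on a single axis. The problem is that each list contains elements the other does not, so I cannot simply plot one list on the axis of the larger list.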

I tried using Python's Counter objects, but since Python dictionaries are unordered, the axis labels did not match the counts shown on the plot.

import matplotlib.pyplot as plt
import seaborn as sns


first_list = ["a", "b", "c", "d", "e", "a", "b", "c", "a", "b", "n"]
second_list = ["a", "b", "c", "d", "e", "e", "d", "c", "e", "q"]


sns.countplot(x=first_list, color="blue", alpha=.5)
sns.countplot(x=second_list, color="red", alpha=.5)


plt.show()

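The code above should display a chart that includes the frequencies of the unique values "n" and "q", but the axis of the resulting chart contains only the values from the second list.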

【Comments】:

    Tags: python plot seaborn frequency-distribution


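    【Solution 1】:

    I think it is better to combine your data into a single data frame and pass that to seaborn, rather than drawing two plots on top of each other. I call sns.barplot on the counts instead of sns.countplot on the raw values.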

    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    # Convert the lists to Series and get the counts
    first_list = pd.Series(
        ["a", "b", "c", "d", "e", "a", "b", "c", "a", "b","n"]
    ).value_counts()
    
    second_list = pd.Series(
        ["a","b","c","d", "e", "e","d","c","e","q"]
    ).value_counts()
    
    #get the counts as a dataframe
    df=pd.concat([first_list,second_list],axis=1)
    df.columns=['first','second']
    
    # melt the data frame so it has a "tidy" data format
    df=df.reset_index().melt(id_vars=['index'])
    
    df
    
       index variable  value
    0      a    first    3.0
    1      b    first    3.0
    2      c    first    2.0
    3      d    first    1.0
    4      e    first    1.0
    5      n    first    1.0
    6      q    first    NaN
    7      a   second    1.0
    8      b   second    1.0
    9      c   second    2.0
    10     d   second    2.0
    11     e   second    3.0
    12     n   second    NaN
    13     q   second    1.0
    
    
    
    #plot a bar graph and assign variable to hue
    sns.barplot(
        x='index',
        y='value',
        hue='variable',
        data=df,
        palette=['blue','red'],
        alpha=.5,
        dodge=False,
    )
    
    plt.show()
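
For comparison, the alternative this answer mentions (calling sns.countplot on the raw values and letting seaborn do the counting) might look roughly like the sketch below. It is not part of the original answer. The names letters, sources and plot_raw_counts are made up for illustration, and the raw values are passed as plain lists only to keep the sketch short; a long-form data frame with the same two columns should work equally well. The order argument is what makes both "n" and "q" appear on the axis.

```python
first_list = ["a", "b", "c", "d", "e", "a", "b", "c", "a", "b", "n"]
second_list = ["a", "b", "c", "d", "e", "e", "d", "c", "e", "q"]

# One entry per observation, plus a parallel list recording its source
# (both names are illustrative, not from the original answer)
letters = first_list + second_list
sources = ["first"] * len(first_list) + ["second"] * len(second_list)

# Shared category order covering both lists, so "n" and "q" both appear
order = sorted(set(letters))


def plot_raw_counts(letters, sources, order):
    """Draw a single countplot with one bar per (letter, source) pair."""
    # Plotting imports live here so the data preparation runs on its own
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.countplot(x=letters, hue=sources, order=order,
                       palette=["blue", "red"], alpha=0.5)
    ax.set_xlabel("letter")
    plt.show()
    return ax
```

Calling plot_raw_counts(letters, sources, order) draws the chart. Unlike the barplot version above, the bars are placed side by side by default; passing dodge=False would overlay them instead.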
    

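    【Comments】:

    • Thanks for the quick reply! I am curious whether there is a way to do this without creating a data frame. I want to plot a relatively large data set, and the data frame seems to carry some overhead that I would like to avoid.
    【Solution 2】:

    I am not aware of any direct way to use a seaborn countplot the way you want without first creating a data frame. Here is a solution built with numpy and matplotlib, based on this example. I will leave it to you to check whether it is more efficient than using a data frame and countplot.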

    import numpy as np                # v 1.19.2
    import matplotlib.pyplot as plt   # v 3.3.2
    
    first_list = ["a", "b", "c", "d", "e", "a", "b", "c", "a", "b", "n"]
    second_list = ["a", "b", "c", "d", "e", "e", "d", "c", "e", "q"]
    
    # Create dictionaries from lists with this format: 'letter':count
    dict1 = dict(zip(*np.unique(first_list, return_counts=True)))
    dict2 = dict(zip(*np.unique(second_list, return_counts=True)))
    
    # Add missing letters with count=0 to each dictionary so that keys in
    # each dictionary are identical
    only_in_set1 = set(dict1)-set(dict2)
    only_in_set2 = set(dict2)-set(dict1)
    dict1.update(dict(zip(only_in_set2, [0]*len(only_in_set2))))
    dict2.update(dict(zip(only_in_set1, [0]*len(only_in_set1))))
    
    # Sort dictionaries alphabetically
    dict1 = dict(sorted(dict1.items()))
    dict2 = dict(sorted(dict2.items()))
    
    # Create grouped bar chart
    xticks = np.arange(len(dict1))
    bar_width = 0.3
    fig, ax = plt.subplots(figsize=(9, 5))
    ax.bar(xticks-bar_width/2, dict1.values(), bar_width,
           color='blue', alpha=0.5, label='first_list')
    ax.bar(xticks+bar_width/2, dict2.values(), bar_width,
           color='red', alpha=0.5, label='second_list')
    
    # Set annotations, x-axis ticks and tick labels
    ax.set_ylabel('Counts')
    ax.set_title('Letter counts grouped by list')
    ax.set_xticks(xticks)
    ax.set_xticklabels(dict1.keys())
    ax.legend(frameon=False)
    plt.show()
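
The question mentions that an attempt with Counter objects produced axis labels that did not match the counts. As a hedged variant of the approach above (not part of the original answer), the same alignment can be sketched with only the standard library: build one sorted list of categories and read both counters in that order. The helper names aligned_counts and plot_grouped_bars are hypothetical.

```python
from collections import Counter

first_list = ["a", "b", "c", "d", "e", "a", "b", "c", "a", "b", "n"]
second_list = ["a", "b", "c", "d", "e", "e", "d", "c", "e", "q"]


def aligned_counts(first, second):
    """Return shared sorted categories and the counts of each list."""
    counts1, counts2 = Counter(first), Counter(second)
    categories = sorted(set(counts1) | set(counts2))
    # A Counter returns 0 for missing keys, so no padding step is needed
    heights1 = [counts1[c] for c in categories]
    heights2 = [counts2[c] for c in categories]
    return categories, heights1, heights2


categories, heights1, heights2 = aligned_counts(first_list, second_list)


def plot_grouped_bars(categories, heights1, heights2, bar_width=0.3):
    """Draw the two count series side by side on one axis."""
    import matplotlib.pyplot as plt  # imported lazily; only needed to plot

    positions = list(range(len(categories)))
    fig, ax = plt.subplots(figsize=(9, 5))
    ax.bar([p - bar_width / 2 for p in positions], heights1, bar_width,
           color="blue", alpha=0.5, label="first_list")
    ax.bar([p + bar_width / 2 for p in positions], heights2, bar_width,
           color="red", alpha=0.5, label="second_list")
    ax.set_xticks(positions)
    ax.set_xticklabels(categories)
    ax.legend(frameon=False)
    plt.show()
```

Because both height lists are read in the order of the single categories list, the tick labels and the bars cannot drift apart, regardless of how each Counter orders its keys.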
    

