【Question title】: Seaborn heatmap colorbar: how to ensure the correct class order and colors are displayed
【Posted】: 2021-04-08 18:28:40
【Question description】:

I have a DataFrame containing the results of some calculation, and I want to plot it as a seaborn heatmap with a colorbar. I am using the following code to do this (mostly adapted from an existing example):

import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# input data
results = [['equal','equal','smaller','smaller or equal','greater or equal'],   
           ['equal','equal','smaller','smaller','greater or equal'],                                      
           ['greater','equal','smaller or equal','smaller','smaller'],
           ['equal','smaller or equal','greater or equal','greater or equal','equal'],
           ['equal','equal','smaller','equal','equal']]

index = ['axc', 'org', 'cf5', 'cm1', 'ext']
columns = ['axc', 'org', 'cf5', 'cm1', 'ext']

# create a dataframe
res_df = pd.DataFrame(results, index=index, columns=columns)

value_to_int = {j:i for i,j in enumerate(['greater','greater or equal','equal','smaller or equal','smaller'])}

n = len(value_to_int)     

# discrete colormap (n samples from a given cmap)
cmap = sns.color_palette("viridis", n) 
ax = sns.heatmap(res_df.replace(value_to_int), cmap=cmap) 

# modify colorbar:
colorbar = ax.collections[0].colorbar 
r = colorbar.vmax - colorbar.vmin 
colorbar.set_ticks([colorbar.vmin + r / n * (0.5 + i) for i in range(n)])
colorbar.set_ticklabels(list(value_to_int.keys()))                                          
plt.show()

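This works like a charm most of the time, but a problem arises when one of the classes in the category list does not occur in the data. To demonstrate, change the DataFrame like this: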

results_changed = [['equal','equal','smaller','smaller or equal','greater or equal'],
              ['equal','equal','smaller','smaller','greater or equal'],
              ['greater or equal','equal','smaller or equal','smaller','smaller'],
              ['equal','smaller or equal','greater or equal','greater or equal','equal'],
              ['equal','equal','smaller','equal','equal']]

index = ['axc', 'org', 'cf5', 'cm1', 'ext']
columns = ['axc', 'org', 'cf5', 'cm1', 'ext']

# create a dataframe
res_df = pd.DataFrame(results_changed, index=index, columns=columns)

value_to_int = {j:i for i,j in enumerate(['greater','greater or equal','equal','smaller or equal','smaller'])}

n = len(value_to_int)  

# discrete colormap (n samples from a given cmap)
cmap = sns.color_palette("viridis", n) 
ax = sns.heatmap(res_df.replace(value_to_int), cmap=cmap) 

# modify colorbar:
colorbar = ax.collections[0].colorbar 
r = colorbar.vmax - colorbar.vmin 
colorbar.set_ticks([colorbar.vmin + r / n * (0.5 + i) for i in range(n)])
colorbar.set_ticklabels(list(value_to_int.keys()))                                          
plt.show()  

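If you then plot it, the resulting heatmap assigns the wrong colors to the classes: because there is no longer any 'greater' case, the palette "shifts", and equal no longer gets the color it had before.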
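
The shift can be reproduced with plain arithmetic. The sketch below assumes the default behavior: without explicit `vmin`/`vmax`, the heatmap scales colors to the minimum and maximum of the data, and a palette of `n` colors splits that range into `n` equal-width bins. `color_index` is a hypothetical helper written for this illustration, not a seaborn or matplotlib function.

```python
def color_index(value, vmin, vmax, n_colors):
    """Palette index used for `value` when [vmin, vmax] is split into
    n_colors equal-width bins (the top edge belongs to the last bin)."""
    fraction = (value - vmin) / (vmax - vmin)
    return min(int(fraction * n_colors), n_colors - 1)

category_order = ['greater', 'greater or equal', 'equal', 'smaller or equal', 'smaller']
value_to_int = {name: i for i, name in enumerate(category_order)}
n = len(category_order)

# All five classes present: the data spans 0..4
full = {name: color_index(code, 0, 4, n) for name, code in value_to_int.items()}

# 'greater' (code 0) is absent: the data spans 1..4, but the palette still has 5 colors
present = {name: code for name, code in value_to_int.items() if name != 'greater'}
shifted = {name: color_index(code, 1, 4, n) for name, code in present.items()}

print(full)     # 'equal' uses palette index 2
print(shifted)  # 'equal' uses palette index 1, and index 2 is not used at all
```

Under these assumptions, 'greater or equal' ends up on the first palette color, which the colorbar still labels 'greater'.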

I tried to fix the problem by changing this line of the code:

value_to_int = {j:i for i,j in enumerate(pd.unique(res_df.values.ravel()))}

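Although this fixes the color assignment, it creates another problem: the colorbar no longer shows the classes in the intended order (which I want to avoid).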
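
The reordering follows from how `pd.unique` works: it returns values in order of first appearance, not in any semantic order. A minimal sketch with the changed data (the variable names `flat` and `appearance_order` are introduced here for illustration):

```python
import numpy as np
import pandas as pd

results_changed = [['equal', 'equal', 'smaller', 'smaller or equal', 'greater or equal'],
                   ['equal', 'equal', 'smaller', 'smaller', 'greater or equal'],
                   ['greater or equal', 'equal', 'smaller or equal', 'smaller', 'smaller'],
                   ['equal', 'smaller or equal', 'greater or equal', 'greater or equal', 'equal'],
                   ['equal', 'equal', 'smaller', 'equal', 'equal']]

# Flatten row by row, as res_df.values.ravel() does
flat = np.array(results_changed, dtype=object).ravel()

# Order of first appearance in the flattened data
appearance_order = list(pd.unique(flat))
value_to_int = {j: i for i, j in enumerate(appearance_order)}

print(appearance_order)
# ['equal', 'smaller', 'smaller or equal', 'greater or equal']
```

The integer codes, and therefore the colorbar, follow this appearance order instead of running from 'greater' to 'smaller'.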

Can anyone suggest how to solve this? I would appreciate any advice.

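【Question discussion】:

    Tags: seaborn heatmap colorbar


    【Solution 1】:

    The best way to keep plots comparable across different inputs is to always pin the colorbar to the same limits, by passing fixed vmin and vmax values that do not depend on the data: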

    import pandas as pd
    from matplotlib import pyplot as plt
    import seaborn as sns
    
    results_changed = [['equal','equal','smaller','smaller or equal','greater or equal'],
                  ['equal','equal','smaller','smaller','greater or equal'],
                  ['greater or equal','equal','smaller or equal','smaller','smaller'],
                  ['equal','smaller or equal','greater or equal','greater or equal','equal'],
                  ['equal','equal','smaller','equal','equal']]
    
    index = ['axc', 'org', 'cf5', 'cm1', 'ext']
    columns = ['axc', 'org', 'cf5', 'cm1', 'ext']
    
    # create a dataframe
    res_df = pd.DataFrame(results_changed, index=index, columns=columns)
    
    #construct dictionary from ordered list
    category_order = ['greater', 'greater or equal', 'equal', 'smaller or equal', 'smaller']    
    value_to_int = {j:i for i,j in enumerate(category_order)}    
    n = len(value_to_int)  
    
    # discrete colormap (n samples from a given cmap)
    cmap = sns.color_palette("viridis", n) 
    # cast to int explicitly; newer pandas versions may keep object dtype after replace()
    ax = sns.heatmap(res_df.replace(value_to_int).astype(int), cmap=cmap, vmin=0, vmax=n)
    
    #modify colorbar:
    colorbar = ax.collections[0].colorbar 
    colorbar.set_ticks([0.5 + i for i in range(n)])
    colorbar.set_ticklabels(category_order)                                          
    plt.show()  
    

    Sample output: [heatmap image not included]
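
A variant of the fixed-limits idea, offered as a sketch and not as part of the original answer: setting the limits to `-0.5` and `n - 0.5` places every integer code in the middle of its color bin instead of on a bin edge, and the ticks then sit at the integer codes themselves. The palette below is a placeholder made up for illustration; `sns.color_palette("viridis", n)` could be used instead.

```python
import numpy as np
from matplotlib.colors import ListedColormap, Normalize

category_order = ['greater', 'greater or equal', 'equal', 'smaller or equal', 'smaller']
n = len(category_order)

# Placeholder RGB palette with one distinct color per category (illustrative values only)
palette = [(0.1 * i, 0.2 * i, 1.0 - 0.2 * i) for i in range(n)]
cmap = ListedColormap(palette)

# Fixed limits: code i falls in the middle of bin i, whatever the data contains
norm = Normalize(vmin=-0.5, vmax=n - 0.5)

codes = np.arange(n)
mapped = cmap(norm(codes))[:, :3]  # RGBA -> RGB

for name, rgb in zip(category_order, mapped):
    print(name, tuple(float(round(c, 2)) for c in rgb))
```

With seaborn, the same limits would be passed as `sns.heatmap(..., cmap=cmap, vmin=-0.5, vmax=n - 0.5)`, followed by `colorbar.set_ticks(range(n))` and `colorbar.set_ticklabels(category_order)`.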

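    If you only want the colorbar to show the colors that actually occur, you can first filter the category list down to the classes present in the data. Note that this changes the color scheme from one input array to the next, which makes the plots hard to compare.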

    import pandas as pd
    from matplotlib import pyplot as plt
    import seaborn as sns
    import numpy as np
    
    results_changed = [['equal','equal','smaller','smaller or equal','greater'],
                  ['equal','equal','smaller','smaller','greater'],
                  ['greater','equal','smaller','smaller','smaller'],
                  ['equal','smaller','greater','greater','equal'],
                  ['equal','equal','smaller','equal','equal']]
    
    index = ['axc', 'org', 'cf5', 'cm1', 'ext']
    columns = ['axc', 'org', 'cf5', 'cm1', 'ext']
    
    # create a dataframe
    res_df = pd.DataFrame(results_changed, index=index, columns=columns)
    
    unique_results = np.unique(results_changed)
    unique_categories = [cat for cat in ['greater','greater or equal','equal','smaller or equal','smaller'] if cat in unique_results]
    
    value_to_int = {j:i for i,j in enumerate(unique_categories)}
    
    n = len(value_to_int)  
    
    # discrete colormap (n samples from a given cmap)
    cmap = sns.color_palette("viridis", n) 
    # cast to int explicitly; newer pandas versions may keep object dtype after replace()
    ax = sns.heatmap(res_df.replace(value_to_int).astype(int), cmap=cmap)
    
    #modify colorbar:
    colorbar = ax.collections[0].colorbar 
    r = colorbar.vmax - colorbar.vmin 
    colorbar.set_ticks([colorbar.vmin + r / n * (0.5 + i) for i in range(n)])
    colorbar.set_ticklabels(unique_categories)
    plt.show()  
    

    Sample output: [heatmap image not included]
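
One possible way to soften that trade-off, sketched here as an assumption and not taken from the original answer: give every category a fixed color up front and pick only the colors of the categories that are present. The hex values in `full_palette` and the helper `present_categories` are hypothetical names and values chosen for this illustration.

```python
category_order = ['greater', 'greater or equal', 'equal', 'smaller or equal', 'smaller']

# Hypothetical fixed palette: one color per category, in category order
full_palette = ['#440154', '#3b528b', '#21918c', '#5ec962', '#fde725']
color_of = dict(zip(category_order, full_palette))

def present_categories(rows, order):
    """Categories from `order` that occur in `rows`, kept in `order`'s sequence."""
    seen = {cell for row in rows for cell in row}
    return [cat for cat in order if cat in seen]

results_changed = [['equal', 'equal', 'smaller', 'smaller or equal', 'greater'],
                   ['equal', 'equal', 'smaller', 'smaller', 'greater'],
                   ['greater', 'equal', 'smaller', 'smaller', 'smaller'],
                   ['equal', 'smaller', 'greater', 'greater', 'equal'],
                   ['equal', 'equal', 'smaller', 'equal', 'equal']]

present = present_categories(results_changed, category_order)
value_to_int = {cat: i for i, cat in enumerate(present)}

# Each present category keeps the color it has in the full palette
sub_palette = [color_of[cat] for cat in present]

print(present)      # ['greater', 'equal', 'smaller or equal', 'smaller']
print(sub_palette)  # ['#440154', '#21918c', '#5ec962', '#fde725']
```

`sub_palette` could then be passed as `cmap=` to `sns.heatmap`, together with explicit `vmin` and `vmax` so that each integer code lands in its own color bin.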

    【Discussion】:

    • Thank you very much for the detailed answer. This is exactly what I needed! I also really appreciate the clear explanation.