【Question title】: Unknown cause of seaborn.violinplot() KeyError
【Posted】: 2021-02-07 14:13:26
【Question】:

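I'm doing exploratory data analysis for the House Prices competition on Kaggle, and I've run into a problem with the seaborn.violinplot() function.

I want to plot LotFrontage with it, but I get the following error: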

sns.violinplot(data=houseprices_num['LotFrontage'], inner='quartile', color='white')
KeyError                                  Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2894             try:
-> 2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 0

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-31-01b815903af9> in <module>
      1 #houseprices_lfnotna = houseprices[houseprices['LotFrontage'].notna()]
----> 2 sns.violinplot(data=houseprices_num['LotFrontage'], inner='quartile', color='white')

~\anaconda3\lib\site-packages\seaborn\_decorators.py in inner_f(*args, **kwargs)
     44             )
     45         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46         return f(**kwargs)
     47     return inner_f
     48 

~\anaconda3\lib\site-packages\seaborn\categorical.py in violinplot(x, y, hue, data, order, hue_order, bw, cut, scale, scale_hue, gridsize, width, inner, split, dodge, orient, linewidth, color, palette, saturation, ax, **kwargs)
   2385 ):
   2386 
-> 2387     plotter = _ViolinPlotter(x, y, hue, data, order, hue_order,
   2388                              bw, cut, scale, scale_hue, gridsize,
   2389                              width, inner, split, dodge, orient, linewidth,

~\anaconda3\lib\site-packages\seaborn\categorical.py in __init__(self, x, y, hue, data, order, hue_order, bw, cut, scale, scale_hue, gridsize, width, inner, split, dodge, orient, linewidth, color, palette, saturation)
    520                  color, palette, saturation):
    521 
--> 522         self.establish_variables(x, y, hue, data, orient, order, hue_order)
    523         self.establish_colors(color, palette, saturation)
    524         self.estimate_densities(bw, cut, scale, scale_hue, gridsize)

~\anaconda3\lib\site-packages\seaborn\categorical.py in establish_variables(self, x, y, hue, data, orient, order, hue_order, units)
     96                 if hasattr(data, "shape"):
     97                     if len(data.shape) == 1:
---> 98                         if np.isscalar(data[0]):
     99                             plot_data = [data]
    100                         else:

~\anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    880 
    881         elif key_is_scalar:
--> 882             return self._get_value(key)
    883 
    884         if is_hashable(key):

~\anaconda3\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
    987 
    988         # Similar to Index.get_value, but we do not fall back to positional
--> 989         loc = self.index.get_loc(label)
    990         return self.index._get_values_for_loc(self, loc, label)
    991 

~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:
-> 2897                 raise KeyError(key) from err
   2898 
   2899         if tolerance is not None:

KeyError: 0

Earlier modifications to houseprices_num that might (though probably do not) cause the error:

traindf = pd.read_csv('.\\train.csv', sep=',', header=0, index_col='Id')
testdf = pd.read_csv('.\\test.csv', sep=',', header=0, index_col='Id')
houseprices = pd.concat([traindf, testdf], axis=0)
houseprices_num = houseprices[['LotFrontage', 'LotArea', 'MasVnrArea', 'BsmtFinSF1', 'BsmtFinSF2', 'BsmtUnfSF',
                              'TotalBsmtSF', '1stFlrSF', '2ndFlrSF', 'LowQualFinSF', 'GrLivArea', 'GarageArea', 'WoodDeckSF', 'OpenPorchSF', 'EnclosedPorch',
                              '3SsnPorch', 'ScreenPorch', 'PoolArea', 'MiscVal', 'SalePrice']]

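From the error message I suspected that the inner= kwarg might be responsible, but according to the seaborn documentation 'quartile' is a valid value. I also tried plotting LotFrontage with the null values removed, without success.

I'd appreciate an explanation of this error and how to fix it. Thanks!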

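【Comments】:

  • Confirm that your concatenation of traindf and testdf is valid. The error message is telling you the shape is incorrect.
  • houseprices has 2919 rows, which equals the traindf rows (1460) plus the testdf rows (1459)
  • houseprices also has 80 columns, the same as traindf

Tags: python pandas seaborn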


【Solution 1】:

If you really only want a single violin to look at the distribution of this column, you can do this:

import seaborn as sns
import numpy as np
sns.violinplot(data=houseprices_num['LotFrontage'].values, 
inner='quartile', color='white')

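I think passing the pandas Series directly is what causes the problem. The traceback shows seaborn evaluating data[0] on it, which pandas treats as a lookup for the index label 0; since the index here is the Id column, that label presumably does not exist, hence KeyError: 0. Using .values passes a plain NumPy array, where [0] is positional. If you want to split the violin plot by another category, you can do this: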

houseprices_num['GarageArea_bin'] = pd.cut(houseprices_num['GarageArea'],2,
labels=['A','B'])
sns.violinplot(data=houseprices_num,x='GarageArea_bin',y='LotFrontage',
inner='quartile', color='white')
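
To see why the original call fails, here is a minimal sketch that reproduces the lookup seaborn performs in the traceback (data[0]) without needing the Kaggle files. The Series below is a made-up stand-in for houseprices_num['LotFrontage']; its values and the 1-based Id index are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for houseprices_num['LotFrontage']: the index mimics
# an 'Id' column that starts at 1, so there is no index label 0.
lot_frontage = pd.Series(
    [65.0, 80.0, np.nan, 60.0],
    index=pd.Index([1, 2, 3, 4], name="Id"),
    name="LotFrontage",
)

# With an integer index, series[0] is a label lookup, not a positional one.
try:
    lot_frontage[0]
    lookup_failed = False
except KeyError as err:
    lookup_failed = True
    print("KeyError:", err)

# Positional access works either through .iloc or through the NumPy array.
first_by_position = lot_frontage.iloc[0]
first_from_array = lot_frontage.to_numpy()[0]
print(first_by_position, first_from_array)
```

This is why handing seaborn the underlying array (.values or .to_numpy()) sidesteps the error: indexing an array with 0 is always positional.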

【Comments】:

    【Solution 2】:

    A violin plot takes numeric data. You supply the x and y variables to plot.

     import matplotlib.pyplot as plt

     # LotFrontage on the y axis, one violin per distinct x value
     sns.violinplot(data=houseprices_num, y="LotFrontage", x="LotArea")
     plt.show()

     # or, for a single violin plot
     sns.violinplot(x=houseprices_num["LotFrontage"])
     plt.show()
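
As a self-contained illustration of the single-violin variant, the sketch below passes the column through the y= keyword instead of data=, keeping the question's inner='quartile' and color='white' arguments. The DataFrame is synthetic (random values, a 1-based Id index, a few NaNs) and only stands in for houseprices_num; the Agg backend is chosen so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for houseprices_num: random values, index starting at 1.
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {"LotFrontage": rng.normal(70, 20, size=200)},
    index=pd.RangeIndex(1, 201, name="Id"),
)
demo.iloc[::10, 0] = np.nan  # sprinkle in missing values like the real column

fig, ax = plt.subplots()
# Naming the variable explicitly (y=) avoids passing the Series as data=.
sns.violinplot(y=demo["LotFrontage"], inner="quartile", color="white", ax=ax)
n_violin_bodies = len(ax.collections)  # filled violin shapes drawn on the axes
plt.close(fig)
```

In an interactive session you would call plt.show() instead of closing the figure.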
    

    【讨论】:

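    • While this code may solve the problem, it is better to add context about why and how it works. That helps future readers learn and eventually apply the knowledge to their own code. You are also likely to get positive feedback and upvotes when the code is explained.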