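【Posted】: 2021-02-07 14:13:26
【Question】:
I am doing exploratory data analysis for the House Prices competition on Kaggle, but I have run into a problem with the seaborn.violinplot() function:
I want to plot LotFrontage with this function, but I get the following error: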
sns.violinplot(data=houseprices_num['LotFrontage'], inner='quartile', color='white')
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2894 try:
-> 2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-31-01b815903af9> in <module>
1 #houseprices_lfnotna = houseprices[houseprices['LotFrontage'].notna()]
----> 2 sns.violinplot(data=houseprices_num['LotFrontage'], inner='quartile', color='white')
~\anaconda3\lib\site-packages\seaborn\_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
~\anaconda3\lib\site-packages\seaborn\categorical.py in violinplot(x, y, hue, data, order, hue_order, bw, cut, scale, scale_hue, gridsize, width, inner, split, dodge, orient, linewidth, color, palette, saturation, ax, **kwargs)
2385 ):
2386
-> 2387 plotter = _ViolinPlotter(x, y, hue, data, order, hue_order,
2388 bw, cut, scale, scale_hue, gridsize,
2389 width, inner, split, dodge, orient, linewidth,
~\anaconda3\lib\site-packages\seaborn\categorical.py in __init__(self, x, y, hue, data, order, hue_order, bw, cut, scale, scale_hue, gridsize, width, inner, split, dodge, orient, linewidth, color, palette, saturation)
520 color, palette, saturation):
521
--> 522 self.establish_variables(x, y, hue, data, orient, order, hue_order)
523 self.establish_colors(color, palette, saturation)
524 self.estimate_densities(bw, cut, scale, scale_hue, gridsize)
~\anaconda3\lib\site-packages\seaborn\categorical.py in establish_variables(self, x, y, hue, data, orient, order, hue_order, units)
96 if hasattr(data, "shape"):
97 if len(data.shape) == 1:
---> 98 if np.isscalar(data[0]):
99 plot_data = [data]
100 else:
~\anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
880
881 elif key_is_scalar:
--> 882 return self._get_value(key)
883
884 if is_hashable(key):
~\anaconda3\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
987
988 # Similar to Index.get_value, but we do not fall back to positional
--> 989 loc = self.index.get_loc(label)
990 return self.index._get_values_for_loc(self, loc, label)
991
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
-> 2897 raise KeyError(key) from err
2898
2899 if tolerance is not None:
KeyError: 0
Earlier modifications to houseprices_num that might (though probably do not) cause the error:
traindf = pd.read_csv('.\\train.csv', sep=',', header=0, index_col='Id')
testdf = pd.read_csv('.\\test.csv', sep=',', header=0, index_col='Id')
houseprices = pd.concat([traindf, testdf], axis=0)
houseprices_num = houseprices[['LotFrontage', 'LotArea', 'MasVnrArea', 'BsmtFinSF1', 'BsmtFinSF2', 'BsmtUnfSF',
'TotalBsmtSF', '1stFlrSF', '2ndFlrSF', 'LowQualFinSF', 'GrLivArea', 'GarageArea', 'WoodDeckSF', 'OpenPorchSF', 'EnclosedPorch',
'3SsnPorch', 'ScreenPorch', 'PoolArea', 'MiscVal', 'SalePrice']]
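From the error message, I suspected that the inner= kwarg might be responsible, but according to the seaborn documentation, 'quartile' is a valid argument. I also tried plotting LotFrontage without the null values, but without success.
I would appreciate an explanation of this error and how to resolve it. Thanks!
【Comments】: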
-
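Confirm that your concatenation of traindf and testdf worked. The error message tells you that the shape is incorrect.
-
houseprices has 2919 rows, which equals the traindf rows (1460) + the testdf rows (1459)
-
houseprices also has 80 columns, the same as traindf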
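-
The traceback ends at `np.isscalar(data[0])` inside seaborn, followed by a failed `self.index.get_loc(label)` in pandas. That points to a label lookup for `0` on a Series whose index (here `Id`, set via `index_col='Id'`) may simply not contain 0, rather than to `inner=` or to the shape. A minimal sketch of that lookup, assuming a small stand-in Series whose index starts at 1 (the values and the names `lot_frontage`, `as_array`, and `reindexed` are illustrative, not taken from the dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for houseprices_num['LotFrontage']: the index mimics
# an 'Id' column that starts at 1, so no row carries the label 0.
lot_frontage = pd.Series(
    [65.0, 80.0, np.nan, 60.0],
    index=pd.Index([1, 2, 3, 4], name="Id"),
    name="LotFrontage",
)

# On a Series with an integer index, [0] is a label lookup, not a positional
# one, so it fails in the same way as the last frames of the traceback.
try:
    lot_frontage[0]
    lookup_error = None
except KeyError as err:
    lookup_error = err

print(repr(lookup_error))

# Positional access does not depend on the index labels.
first_value = lot_frontage.iloc[0]

# Two ways to obtain data in which position 0 and label 0 both resolve:
as_array = lot_frontage.dropna().to_numpy()               # plain NumPy array
reindexed = lot_frontage.dropna().reset_index(drop=True)  # index 0, 1, 2, ...

print(first_value, as_array[0], reindexed[0])
```

If this is indeed the cause, passing the array or the re-indexed Series as `data=` should get past the `data[0]` check. Alternatively, passing the column through `y=` (or `x=`) instead of `data=`, for example `sns.violinplot(y=houseprices_num['LotFrontage'], inner='quartile', color='white')`, appears to take a different code path in this seaborn version that never indexes the Series with `[0]`.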