【Question title】: Dummy variable is not created
【Posted】: 2020-12-29 06:45:51
【Question description】:

I am having trouble creating dummy variables: selecting the "town" column raises a KeyError, even though my syntax looks correct. Please help; I cannot see what is wrong.

import pandas as pd
import numpy as np
df= pd.read_csv('homeprices.csv')
df

dummies=pd.get_dummies(df['town'])
dummies

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
c:\users\saurabh singh\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2894             try:
-> 2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'town'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-30-b0961e3e5942> in <module>
      1 # df = pd.concat([df, pd.get_dummies(df['town'])], axis=1)
----> 2 dummies=pd.get_dummies(df['town'])
      3 dummies

c:\users\saurabh singh\appdata\local\programs\python\python37\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2904             if self.columns.nlevels > 1:
   2905                 return self._getitem_multilevel(key)
-> 2906             indexer = self.columns.get_loc(key)
   2907             if is_integer(indexer):
   2908                 indexer = [indexer]

c:\users\saurabh singh\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:
-> 2897                 raise KeyError(key) from err
   2898 
   2899         if tolerance is not None:

KeyError: 'town'

df.columns
Index(['town ', 'area', 'price'], dtype='object')

【Comments】:

  • Can you post the output of df.columns here?
  • As you can see, there is a space after "town" that is part of the column name, which is why you get the KeyError.
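
A trailing space like this is easy to miss in a notebook display. As a minimal sketch, the check below prints each column name with repr() so the quotes make stray whitespace visible. The DataFrame is a hypothetical stand-in for homeprices.csv (the values are made up), and `padded` is just an illustrative variable name.

```python
import pandas as pd

# Hypothetical stand-in for homeprices.csv; the values are made up.
df = pd.DataFrame({
    "town ": ["A", "B", "A"],   # note the trailing space in the column name
    "area": [2600, 3000, 3200],
    "price": [550000, 565000, 610000],
})

# repr() wraps each name in quotes, so leading/trailing spaces show up.
print([repr(col) for col in df.columns])

# Collect the names that differ from their whitespace-stripped form.
padded = [col for col in df.columns if col != col.strip()]
print("columns with stray whitespace:", padded)
```

Any name that ends up in `padded` has to be typed with its whitespace (or cleaned up first) before df[...] will find it.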

Tags: pandas dummy-variable


【Solution 1】:

Your column name 'town ' has a trailing space.

Change the column names as follows:

df.columns = ['town', 'area', 'price']

After this, you can use:

dummies=pd.get_dummies(df['town'])

Alternatively, just change df['town'] to df['town '] so the key matches the actual column name.
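
If you would rather not retype every column name, one possible variation (not part of the original answer) is to strip whitespace from all column names with df.columns.str.strip() before calling get_dummies. The sketch below assumes a small in-memory CSV in place of homeprices.csv; `csv_text` and its values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for homeprices.csv.
# The header deliberately contains "town " with a trailing space.
csv_text = (
    "town ,area,price\n"
    "A,2600,550000\n"
    "B,3000,565000\n"
    "A,3200,610000\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Remove leading/trailing whitespace from every column name.
df.columns = df.columns.str.strip()

# The lookup by 'town' now succeeds.
dummies = pd.get_dummies(df["town"])
print(dummies)
```

This keeps working if the file later gains more columns, whereas assigning a hard-coded list to df.columns must match the number and order of columns exactly.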

【Discussion】:
