Finding the max element of a DataFrame column raises an error

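【Question title】: Finding max element from column of DataFrame gives error
【Posted】: 2018-12-09 00:12:51
【Question description】:

I am trying to find the max element of a column in my DataFrame, but it raises the error below. I have tested the other columns and they all work fine; only this column name fails.

This is the DataFrame I created from the file posts1.csv: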

import pandas as pd

posts_n = pd.read_csv('posts1.csv',encoding='latin-1')
posts=posts_n.fillna(0)

When I try to find the max element of one specific column, namely 'score':

max_post = posts['score'].max()
max_post

I get the following error:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2441             try:
-> 2442                 return self._engine.get_loc(key)
   2443             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'score'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-12-09c353ba0de2> in <module>()
     34 #MAximum posts done by a user
     35 
---> 36 max_post = posts['score'].max()
     37 max_post
     38 #scr=posts.iloc[:,4]

~\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   1962             return self._getitem_multilevel(key)
   1963         else:
-> 1964             return self._getitem_column(key)
   1965 
   1966     def _getitem_column(self, key):

~\Anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   1969         # get column
   1970         if self.columns.is_unique:
-> 1971             return self._get_item_cache(key)
   1972 
   1973         # duplicate columns & possible reduce dimensionality

~\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   1643         res = cache.get(item)
   1644         if res is None:
-> 1645             values = self._data.get(item)
   1646             res = self._box_item_values(item, values)
   1647             cache[item] = res

~\Anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   3588 
   3589             if not isnull(item):
-> 3590                 loc = self.items.get_loc(item)
   3591             else:
   3592                 indexer = np.arange(len(self.items))[isnull(self.items)]

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2442                 return self._engine.get_loc(key)
   2443             except KeyError:
-> 2444                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2445 
   2446         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'score'

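This is what the data in posts1.csv looks like

【Comments】:

Please add posts.head() to your post.
@tobsecret That didn't help.
What I mean is: after posts=posts_n.fillna(0), please add print(posts.head()) and edit your post accordingly, so that I know exactly what is in your DataFrame.
I found a solution, and this works: posts_n = pd.read_csv('posts1.csv',encoding='latin-1',sep='\s*,\s*')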
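
The fix in the last comment suggests that the header line of posts1.csv has whitespace around its commas, so the column was stored under a padded name such as ' score ' instead of 'score'. The file itself is not shown, so this is an assumption; the sketch below uses a small hypothetical inline CSV in place of posts1.csv to reproduce the symptom and show two possible ways around it.

```python
import io
import pandas as pd

# Hypothetical stand-in for posts1.csv (the real file is not shown):
# note the spaces on both sides of every comma.
raw = "id , score , title\n1 , 10 , first\n2 , 42 , second\n3 , 7 , third\n"

# Default parsing splits on ',' only, so the padding stays in the column names.
padded = pd.read_csv(io.StringIO(raw))
print(padded.columns.tolist())  # printing the names makes stray spaces visible

# Option 1 (the asker's fix): a regex separator that swallows the whitespace.
# Regex separators are handled by the Python engine, so it is requested explicitly.
posts = pd.read_csv(io.StringIO(raw), sep=r"\s*,\s*", engine="python")

# Option 2: keep the default parser and strip the column names afterwards.
stripped = padded.rename(columns=str.strip)

max_post = posts["score"].max()
print(max_post)
```

Printing `posts.columns.tolist()` is a quick first check whenever a column that is clearly in the file raises a KeyError, since stray whitespace does not show up in the usual tabular display.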

Tags: pandas dataframe machine-learning max slice

【Solution 1】:

'score' is not in the (column) index, so instead of the first row of the CSV being loaded as the header row, it is being read as data.

Try the following:

posts = pd.read_csv('posts1.csv', header=1)

【Discussion】:

I wrote posts_n = pd.read_csv('posts1.csv',encoding='latin-1',header=1) and I still get the same error.
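
For context on why header=1 may not help: in pd.read_csv, the header argument is a 0-based row number, and the default already treats the first line as the header. A minimal sketch, using a hypothetical two-column CSV rather than the asker's file, shows what each setting does to the column names.

```python
import io
import pandas as pd

# Hypothetical, cleanly formatted CSV used only to illustrate the header argument.
raw = "id,score\n1,10\n2,42\n"

# Default behaviour: the first line (row 0) supplies the column names.
default = pd.read_csv(io.StringIO(raw))

# header=1: the second line supplies the names; the line before it is skipped.
shifted = pd.read_csv(io.StringIO(raw), header=1)

# header=None: every line is data and the columns are numbered 0, 1, ...
no_header = pd.read_csv(io.StringIO(raw), header=None)

print(default.columns.tolist())
print(shifted.columns.tolist())
print(no_header.columns.tolist())
```

With header=1 the real header line is discarded, so 'score' cannot appear among the column names at all, which is consistent with the same KeyError being reported above.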