python pandas group by float range gives TypeError [duplicate] - Answers

【Question title】: python pandas group by float range gives TypeError [duplicate]
【Posted】: 2021-10-15 05:57:23
【Question description】:

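I am reading a csv file into pandas and want to group it and plot it as a bar chart. With groupby and pd.cut I get the error below (I am following https://stackoverflow.com/a/48280774/2005559). The actual csv has many columns, most of which are strings, so I can't use read_csv with astype('float'), if that is the source of the problem.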

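import numpy as np
import pandas as pd

dataset = pd.read_csv("res.csv")
# Count the rows for each distinct value in the IF column
print(dataset.groupby(['IF']).size())
# Bin IF into ranges and count the rows per bin (this call raises the TypeError)
dataset.groupby(
    pd.cut(dataset['IF'],
           bins=[1, 3, 5, 7, 9, np.inf],
           labels=["<1", "<3", "<5", "<7",
                   "<9"])).size().reset_index(name='count')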

This prints the group sizes and then raises an error:

IF
0        23
0.29      1
0.4       7
0.51      1
0.516     1
         ..
9.02      2
9.16      1
9.227     1
9.3       1
9.567     2
Length: 299, dtype: int64
Traceback (most recent call last):
  File "/home/rudra/Projects/Indent/init.py", line 13, in <module>
    pd.cut(dataset['IF'],
  File "/usr/lib64/python3.9/site-packages/pandas/core/reshape/tile.py", line 273, in cut
    fac, bins = _bins_to_cuts(
  File "/usr/lib64/python3.9/site-packages/pandas/core/reshape/tile.py", line 407, in _bins_to_cuts
    ids = ensure_int64(bins.searchsorted(x, side=side))
TypeError: '<' not supported between instances of 'float' and 'str'

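【Comments】:

"The actual csv has many columns, most of which are strings, so I can't use read_csv with astype('float'), if that is the source." Of course it is the source; the error message tells you that it is trying to do a comparison that involves a string, so one of the columns you are comparing must contain strings. Sure, you can't convert the entire table, but presumably you considered converting just that column?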

Tags: python pandas
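
As the comment points out, the TypeError means the IF column holds strings. Below is a minimal sketch for finding the entries responsible. It assumes an inline CSV as a hypothetical stand-in for res.csv; the "unknown" value and the name column are made up for illustration and are not taken from the asker's file.

```python
import io

import pandas as pd

# Hypothetical stand-in for res.csv: a single non-numeric entry ("unknown")
# is enough for pandas to read the whole IF column as strings.
csv_text = "name,IF\na,0.4\nb,unknown\nc,9.3\n"
dataset = pd.read_csv(io.StringIO(csv_text))

# Try to parse every entry; anything unparseable becomes NaN.
as_number = pd.to_numeric(dataset["IF"], errors="coerce")

# Rows that had a value but could not be parsed are the offenders.
bad_rows = dataset[as_number.isna() & dataset["IF"].notna()]

print(dataset["IF"].dtype)  # a string/object dtype, not float64
print(bad_rows)
```

Inspecting bad_rows first shows whether the stray values are typos, placeholders, or genuinely missing data before they get coerced away.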

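【Solution 1】:

Have you tried converting only this column to a numeric dtype? With errors='coerce', any value that cannot be parsed as a number becomes NaN instead of raising an error.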

dataset['IF'] = pd.to_numeric(dataset['IF'], errors='coerce')
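
To see the conversion and the binning together, here is a rough sketch. It again assumes a made-up inline CSV in place of res.csv; the bins and labels are copied from the question, and observed=False is passed only so that empty bins still show up in the counts.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data standing in for res.csv (values invented for illustration).
csv_text = "name,IF\na,0.4\nb,unknown\nc,2.5\nd,4.0\ne,9.3\nf,9.567\n"
dataset = pd.read_csv(io.StringIO(csv_text))

# Convert only the IF column; "unknown" becomes NaN.
dataset["IF"] = pd.to_numeric(dataset["IF"], errors="coerce")

# Same bins and labels as in the question.
binned = pd.cut(dataset["IF"],
                bins=[1, 3, 5, 7, 9, np.inf],
                labels=["<1", "<3", "<5", "<7", "<9"])

# Count rows per bin; rows whose bin is NaN are left out of the result.
counts = dataset.groupby(binned, observed=False).size().reset_index(name="count")
print(counts)
```

Note that with these bins, values of 1 or below (such as 0.4) fall outside every interval, so pd.cut assigns them NaN and they are not counted. The label "<1" ends up naming the interval (1, 3], so the labels may need adjusting to match the intended ranges.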

【Discussion】: