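【Posted】: 2021-10-15 05:57:23
【Question】:
I am reading a csv file into pandas and want to group the data and plot it as a bar chart.
With groupby and pd.cut I get the error below (I am following https://stackoverflow.com/a/48280774/2005559). The actual csv has many columns, most of them strings, so I can't use read_csv with astype('float'), if that is the source of the problem.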
import numpy as np
import pandas as pd

dataset = pd.read_csv("res.csv")
print(dataset.groupby(['IF']).size())
dataset.groupby(
    pd.cut(dataset['IF'],
           bins=[1, 3, 5, 7, 9, np.inf],
           labels=["<1", "<3", "<5", "<7",
                   "<9"])).size().reset_index(name='count')
This gives the following output and error:
IF
0 23
0.29 1
0.4 7
0.51 1
0.516 1
..
9.02 2
9.16 1
9.227 1
9.3 1
9.567 2
Length: 299, dtype: int64
Traceback (most recent call last):
  File "/home/rudra/Projects/Indent/init.py", line 13, in <module>
    pd.cut(dataset['IF'],
  File "/usr/lib64/python3.9/site-packages/pandas/core/reshape/tile.py", line 273, in cut
    fac, bins = _bins_to_cuts(
  File "/usr/lib64/python3.9/site-packages/pandas/core/reshape/tile.py", line 407, in _bins_to_cuts
    ids = ensure_int64(bins.searchsorted(x, side=side))
TypeError: '<' not supported between instances of 'float' and 'str'
【Discussion】:
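-
"The actual csv has many columns, most of them strings, so I can't use read_csv with astype('float'), if that is the source of the problem." Of course it is the source: the error message tells you that a comparison involving a string is being attempted, so one of the columns being compared must contain strings. Granted, you can't convert the whole table, but presumably you have considered converting just that one column?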
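
Converting only the one column, as the comment suggests, might look like the sketch below. It is a minimal illustration rather than a tested fix for the real file: the inline csv text is a hypothetical stand-in for res.csv, and the "not available" entry is an assumption about what kind of value forces the IF column to be read as strings. The bin edges and labels are also adjusted (an assumption, not the asker's original choice) so that values below 1 are counted and each label matches its interval.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for res.csv: the "IF" column contains one
# non-numeric entry, so pandas reads the whole column as strings.
csv_text = "name,IF\na,0.4\nb,2.5\nc,not available\nd,9.3\ne,6.1\n"
dataset = pd.read_csv(io.StringIO(csv_text))

# Convert only the "IF" column; entries that cannot be parsed become NaN.
dataset["IF"] = pd.to_numeric(dataset["IF"], errors="coerce")

# right=False makes the intervals [0, 1), [1, 3), ..., [9, inf),
# so the labels below describe each bin's upper bound accurately.
binned = pd.cut(
    dataset["IF"],
    bins=[0, 1, 3, 5, 7, 9, np.inf],
    labels=["<1", "<3", "<5", "<7", "<9", ">=9"],
    right=False,
)

# Rows whose "IF" is NaN fall into no bin and are left out of the counts.
counts = dataset.groupby(binned, observed=False).size().reset_index(name="count")
print(counts)
```

With errors="coerce" the unparseable rows are silently turned into NaN, so it may be worth checking dataset["IF"].isna().sum() after the conversion to see how many rows were affected.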