Pandas doesn't return the correct number when adding values in cells

【Question title】: Pandas doesn't return the correct number while adding values in cells
【Posted】: 2019-11-13 09:46:53
【Question description】:

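I got the original dataset from https://www.kaggle.com/gustavomodelli/forest-fires-in-brazil

Acre_dataset is a subset of the original dataset.

I am trying to sum the values in the 'number' column, grouped by the distinct values in the 'year' column, as in the screenshot below.

Acre_dataset contains about 300 rows, and every value in the 'number' column has at most three decimal places (so there are no numbers like 1.0001, but there are numbers like 1.001).

Code from the screenshot: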

# One running total per entry in year_ls
Acre_firecount = [0] * len(year_ls)
print(type(Acre_dataset.iloc[0]['number']))
# Add each row's 'number' to the total of its matching year
for i in range(len(Acre_dataset)):
    for j in range(len(year_ls)):
        if Acre_dataset.iloc[i]['year'] == year_ls[j]:
            Acre_firecount[j] += Acre_dataset.iloc[i]['number']
print(Acre_firecount)
type(Acre_firecount[12])

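But two odd-looking numbers show up in this list: 475.21299999999997 and 618.4300000000001.

I have checked the data type of the values in the 'number' column of Acre_dataset and the data type of the elements of the list Acre_firecount (this is the result); both are numpy.float64.

Why does this happen, and how can I avoid it?

【Comments】: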

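Please provide a copy of the data rather than a screenshot of it, so that the problem is easier to reproduce.
This is a known issue with Python's floating-point representation (not a pandas fault). Use Acre_dataset['number'] = Acre_dataset['number'].round(3) to force it to 3 decimal places.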
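
A minimal sketch of the rounding idea from the comment above, using a small made-up frame in place of Acre_dataset (the values are hypothetical, not taken from the Kaggle data). Because the inputs already have at most three decimals, the sketch assumes the rounding should be applied to the accumulated totals, where the extra digits actually appear, rather than to the inputs. It also uses groupby instead of the nested loop; that is a substitution for illustration, not the asker's code.

```python
import pandas as pd

# Hypothetical stand-in for Acre_dataset; the values are invented for illustration.
df = pd.DataFrame({
    "year": [1998, 1998, 1998, 1999, 1999],
    "number": [0.1, 0.2, 0.3, 1.001, 2.002],
})

# Sum 'number' per year; the raw totals may carry binary rounding noise.
totals = df.groupby("year")["number"].sum()

# Round the totals (not the inputs) back to the 3 decimals the data really has.
rounded = totals.round(3)
print(rounded)
```

Rounding only changes the stored value to the nearest double at three decimals; if exact decimal totals are required, a decimal type is the safer choice.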

Tags: python pandas

【Solution 1】:

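I found the explanation in the Python documentation.

It says:

Python only prints a decimal approximation to the true decimal value of the binary approximation stored by the machine.

【Discussion】: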
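
The quoted sentence can be illustrated with the standard library alone. The sketch below uses 0.1 + 0.2 as a stand-in for the sums in the question (an assumption for illustration; the same effect produces 475.21299999999997) and shows the stored binary value, a tolerance-based comparison, and exact decimal arithmetic as one possible alternative.

```python
from decimal import Decimal
import math

total = 0.1 + 0.2

# repr gives the shortest decimal string that round-trips to the stored double.
print(repr(total))  # 0.30000000000000004

# Decimal(float) exposes the exact binary value the machine stores for 0.1.
print(Decimal(0.1))

# Compare floats with a tolerance instead of ==.
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True

# Decimals built from strings are added exactly.
exact = Decimal("0.1") + Decimal("0.2")
print(exact)  # 0.3
```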