Python double loses precision using avro schema: answers

【Question title】: Python double loses precision using avro schema
【Posted】: 2021-10-26 15:44:18
【Question description】:

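I am serializing some data with an Avro schema. The code is written in Python, and I am running into a loss of precision: Python appears to round the number and print it in scientific notation.

What I see: 1.2345678901234568e+16

What I expect to see: 12345678901234567.19

A reproducible code example follows: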

from fastavro import writer, reader, parse_schema

schema = {
    'doc': 'A weather reading.',
    'name': 'Weather',
    'namespace': 'test',
    'type': 'record',
    'fields': [
        {'name': 'station', 'type': 'string'},
        {'name': 'time', 'type': 'double'},
        {'name': 'temp', 'type': 'double'},
    ],
}
parsed_schema = parse_schema(schema)

# 'records' can be an iterable (including generator)
records = [
    {u'station': u'011990-99999', u'temp': 0, u'time': 1433269388},
    {u'station': u'011990-99999', u'temp': -11, u'time': 12345678901234567.19},
    {u'station': u'012650-99999', u'temp': 111, u'time': 1433275478},
]

# Writing
with open('weather.avro', 'wb') as out:
    writer(out, parsed_schema, records)

# Reading
with open('weather.avro', 'rb') as fo:
    for record in reader(fo):
        print(record)

I believe there may be a way to override or write my own deserializer, which would let me control how the double is deserialized into a string.

Any ideas?

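【Comments on the question】:

What you are seeing is scientific notation. Have you tried expanding this integer?
What exactly do you mean by expanding the integer?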
stackoverflow.com/questions/658763/…
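Yes, I tried that. "Expanding" only takes the number and displays it in a chosen format; it does not address the precision problem. Try it:
    >>> b = 123456789012345678.789
    >>> b
    1.2345678901234568e+17
    >>> f'{b:20.5f}'
    '123456789012345680.00000'
I don't think this has anything to do with avro. As the answer below shows, the decimal type or Python's Decimal class would be a better fit for exact values.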
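
The point made in these comments can be checked without Avro at all. The following is a minimal sketch using only the standard library; the variable names are illustrative. It shows that the literal from the question is already rounded to the nearest IEEE 754 double when Python parses it, so no output format can bring the lost digits back, while a Decimal built from a string keeps them.

```python
from decimal import Decimal

# The literal from the question is rounded to the nearest double at parse time.
# Doubles near 1.2e16 are spaced 2 apart, so the fractional part cannot survive.
as_float = 12345678901234567.19

# Decimal(float) exposes the exact value the double actually holds.
exact_value_of_float = Decimal(as_float)

# Building the Decimal from a string keeps every digit.
as_decimal = Decimal("12345678901234567.19")

print(repr(as_float))        # 1.2345678901234568e+16
print(exact_value_of_float)  # 12345678901234568
print(f"{as_float:.2f}")     # 12345678901234568.00
print(as_decimal)            # 12345678901234567.19
```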

Tags: python avro

【Solution 1】:

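If you want to use a custom logical type, fastavro supports that: https://fastavro.readthedocs.io/en/latest/logical_types.html#custom-logical-types. Of course, if other implementations are also in use, they will not understand the custom logical type.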

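However, the main problem is the floating-point rounding that exists in almost every language. A better way to make sure no rounding happens is probably the Decimal type: https://avro.apache.org/docs/current/spec.html#Decimal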

【Comments】:

Not exactly the complete answer I was looking for, but it helped a lot; marked as the solution.