Why does Clang std::ostream write a double that std::istream can't read? - Answers

【Question title】: Why does Clang std::ostream write a double that std::istream can't read?
【Posted】: 2019-02-23 22:15:56
【Question description】:

I'm using an application that uses std::stringstream to read matrices of space-separated doubles from a text file. The application uses code a bit like:

std::ifstream file {"data.dat"};
const auto header = read_header(file);
const auto num_columns = header.size();
std::string line;
while (std::getline(file, line)) {
    std::istringstream ss {line}; 
    double val;
    std::size_t tokens {0};
    while (ss >> val) {
        // do stuff
        ++tokens;
    }
    if (tokens < num_columns) throw std::runtime_error {"Bad data matrix..."};
}

Pretty standard stuff. I diligently wrote some code to produce the data matrix (data.dat), using the following method for each data line:

void write_line(const std::vector<double>& data, std::ostream& out)
{
    std::copy(std::cbegin(data), std::prev(std::cend(data)),
              std::ostream_iterator<double> {out, " "});
    out << data.back() << '\n';
}

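That is, using std::ostream. However, I found that the application could not read a data file written this way (it throws the exception above); in particular, it failed to read 7.0552574226130007e-321.

I wrote the following minimal test case that shows the behaviour: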

// iostream_test.cpp

#include <iostream>
#include <string>
#include <sstream>

int main()
{
    constexpr double x {1e-320};
    std::ostringstream oss {};
    oss << x;
    const auto str_x = oss.str();
    std::istringstream iss {str_x};
    double y;
    if (iss >> y) {
        std::cout << y << std::endl;
    } else {
        std::cout << "Nope" << std::endl;
    }
}

I tested this code on LLVM 10.0.0 (clang-1000.11.45.2):

$ clang++ --version
Apple LLVM version 10.0.0 (clang-1000.11.45.2)
Target: x86_64-apple-darwin17.7.0 
$ clang++ -std=c++14 -o iostream_test iostream_test.cpp
$ ./iostream_test
Nope

I also tried compiling with Clang 6.0.1, 6.0.0, 5.0.1, 5.0.0, 4.0.1, and 4.0.0, with the same result.

Compiled with GCC 8.2.0, the code works as expected:

$ g++-8 -std=c++14 -o iostream_test iostream_test.cpp
$ ./iostream_test
9.99989e-321

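Why is there a difference between Clang and GCC? Is this a Clang bug and, if not, how should one use C++ streams to write portable floating-point IO?

【Comments】:

Unrelated, but the variable names iss for output and oss for input are a weird and confusing choice.
fyi 1e-320 is in the subnormal range: en.cppreference.com/w/cpp/language/type
@ShafikYaghmour Well, 1e-320 is clearly within the range [-DBL_MAX, DBL_MAX], even if it is not exactly representable.
I filed a libc++ bug.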

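Tags: c++ gcc floating-point clang iostream

【Solution 1】:

I believe clang is conformant here. The answer to "std::stod throws out_of_range error for a string that should be valid" says:

The C++ standard allows conversions of strings to double to report underflow if the result is in the subnormal range, even though it is representable.

7.63918•10^-313 is within the range of double, but it is in the subnormal range. The C++ standard says stod calls strtod and then defers to the C standard to define strtod. The C standard indicates that strtod may underflow, about which it says "The result underflows if the magnitude of the mathematical result is so small that the mathematical result cannot be represented, without extraordinary roundoff error, in an object of the specified type." That is awkward phrasing, but it refers to the rounding errors that occur when subnormal values are encountered. (Subnormal values are subject to larger relative errors than normal values, so their rounding errors might be said to be extraordinary.)

Thus, a C++ implementation is allowed by the C++ standard to underflow for subnormal values even though they are representable.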

We can confirm that we rely on strtod from [facet.num.get.virtuals]p3.3.4:

For a double value, the function strtod.

We can test this with this little program (see it live):

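#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <string>

void check(const char* p)
{
    std::string str{p};

    errno = 0;  // clear any stale error so the "before" value is meaningful
    std::printf("errno before: %d\n", errno);
    double val = std::strtod(str.c_str(), nullptr);
    std::printf("val: %g\n", val);
    std::printf("errno after: %d\n", errno);
    std::printf("ERANGE value: %d\n", ERANGE);
}

int main()
{
    check("9.99989e-321");
}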

which gives the following result:

errno before: 0
val: 9.99989e-321
errno after: 34
ERANGE value: 34

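C11 in 7.22.1.3p10 tells us:

The functions return the converted value, if any. If no conversion could be performed, zero is returned. If the correct value overflows and default rounding is in effect (7.12.1), plus or minus HUGE_VAL, HUGE_VALF, or HUGE_VALL is returned (according to the return type and sign of the value), and the value of the macro ERANGE is stored in errno. If the result underflows (7.12.1), the functions return a value whose magnitude is no greater than the smallest normalized positive number in the return type; whether errno acquires the value ERANGE is implementation-defined.

POSIX uses that convention:

[ERANGE]
The value to be returned would cause overflow or underflow.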

We can verify that it is subnormal via fpclassify (see it live).

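【Comments】:

@Daniel: It is a bug. It does not violate the C++ standard, but it is poor design. It should be reported as a bug and fixed; an input stream failing to read what the output stream wrote is a good example to include in the bug report.
That strtod may report ERANGE on this input says nothing about whether do_get is allowed to set failbit.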