FLT_MAX for half floats

【Question title】: FLT_MAX for half floats
【Posted】: 2021-02-28 02:30:01
【Question】:

I am using CUDA with half floats, or __half as they are called in CUDA.

What is the equivalent of FLT_MAX for half floats?

The cuda_fp16.h header does not seem to have a comparable macro.

$ grep MAX /usr/local/cuda-11.1/targets/x86_64-linux/include/cuda_fp16.h
$

【Comments】:

Tags: cuda math.h half-precision-float

【Solution 1】:

I have needed similar macros before (though not in CUDA) and found some constants in this C++ fp16 proposal for short floats.

The "S" prefix comes from the "short" in the proposed short float.

// Smallest positive short float
// (subnormal, 2^-24; unlike FLT_MIN, which is normalized)
#define SFLT_MIN 5.96046448e-08
// Smallest positive
// normalized short float (2^-14)
#define SFLT_NRM_MIN 6.10351562e-05
// Largest positive short float
#define SFLT_MAX 65504.0
// Difference between 1.0 and the next
// representable short float (2^-10)
#define SFLT_EPSILON 0.0009765625
// Number of digits in mantissa
// (significand + hidden leading 1)
#define SFLT_MANT_DIG 11
// Number of base 10 digits that
// can be represented without change:
// floor((SFLT_MANT_DIG - 1) * log10(2)) = 3
#define SFLT_DIG 3
// Base of the exponent
#define SFLT_RADIX 2
// Minimum negative integer such that
// SFLT_RADIX raised to the power of
// one less than that integer is a
// normalized short float
#define SFLT_MIN_EXP -13
// Maximum positive integer such that
// SFLT_RADIX raised to the power of
// one less than that integer is a
// representable finite short float
#define SFLT_MAX_EXP 16
// Minimum negative integer such
// that 10 raised to that power is
// a normalized short float
#define SFLT_MIN_10_EXP -4
// Maximum positive integer such
// that 10 raised to that power is
// a normalized short float
#define SFLT_MAX_10_EXP 4
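
As a sanity check, the floating-point constants above can be re-derived from the format parameters alone (11 significand bits, exponent limits -13 and 16). The following is only a host-side C++ sketch that computes in double; the names kMantDig, kMinExp, kMaxExp and the half_* helpers are made up for illustration and come from neither CUDA nor the proposal.

```cpp
#include <cassert>
#include <cmath>

// Format parameters of IEEE 754 binary16 (illustrative names, not from any library).
constexpr int kMantDig = 11;  // 10 stored fraction bits + hidden leading 1
constexpr int kMinExp = -13;  // 2^(kMinExp - 1) is the smallest normalized value
constexpr int kMaxExp = 16;   // 2^(kMaxExp - 1) is the largest representable power of two

// Largest finite value: (2 - 2^-10) * 2^15
double half_max() {
    return std::ldexp(2.0 - std::ldexp(1.0, 1 - kMantDig), kMaxExp - 1);
}

// Smallest positive normalized value: 2^-14
double half_nrm_min() {
    return std::ldexp(1.0, kMinExp - 1);
}

// Smallest positive subnormal value: 2^-24
double half_denorm_min() {
    return std::ldexp(1.0, kMinExp - kMantDig);
}

// Gap between 1.0 and the next representable value: 2^-10
double half_epsilon() {
    return std::ldexp(1.0, 1 - kMantDig);
}

// Compare against the exact decimal expansions of the same powers of two.
void check_half_limits() {
    assert(half_max() == 65504.0);
    assert(half_nrm_min() == 6.103515625e-05);
    assert(half_denorm_min() == 5.9604644775390625e-08);
    assert(half_epsilon() == 0.0009765625);
}
```

Calling check_half_limits() from a test or from main() confirms that 65504.0 is exactly (2 - 2^-10) * 2^15, which is where the SFLT_MAX value comes from.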

You can also find similar constants in the half.hpp library.

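Note: I am not sure what the CUDA compiler supports in terms of fp16 literals. So you may need to convert these constants to hex and then reinterpret the bits as __half (beware of the difference between a value conversion and a bit-level cast).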
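
To illustrate the hex route, here is a small host-side C++ sketch that decodes binary16 bit patterns into float, so the hex encodings can be checked without a GPU. The helper half_bits_to_float and the k...Bits names are hypothetical. In CUDA itself, the reinterpretation would presumably go through an intrinsic such as __ushort_as_half from cuda_fp16.h, which is not exercised here. A value conversion of the integer 0x7BFF (31743) would instead give a half close to 31743, not 65504.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Bit patterns of a few binary16 constants (illustrative names).
constexpr std::uint16_t kHalfMaxBits = 0x7BFF;        // 65504
constexpr std::uint16_t kHalfNrmMinBits = 0x0400;     // 2^-14
constexpr std::uint16_t kHalfDenormMinBits = 0x0001;  // 2^-24
constexpr std::uint16_t kHalfEpsilonBits = 0x1400;    // 2^-10

// Decode an IEEE 754 binary16 bit pattern into a float.
// Layout: 1 sign bit, 5 exponent bits (bias 15), 10 fraction bits.
float half_bits_to_float(std::uint16_t bits) {
    const int sign = (bits >> 15) & 0x1;
    const int exponent = (bits >> 10) & 0x1F;
    const int fraction = bits & 0x3FF;

    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: fraction * 2^-24
        magnitude = std::ldexp(static_cast<float>(fraction), -24);
    } else if (exponent == 0x1F) {
        // All-ones exponent: infinity or NaN
        magnitude = (fraction != 0) ? NAN : INFINITY;
    } else {
        // Normal: (1 + fraction / 1024) * 2^(exponent - 15)
        magnitude = std::ldexp(static_cast<float>(1024 + fraction), exponent - 25);
    }
    return (sign != 0) ? -magnitude : magnitude;
}
```

For example, half_bits_to_float(kHalfMaxBits) evaluates to (1024 + 1023) * 2^5 = 65504, matching SFLT_MAX.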

Neither of these options is ideal. If someone can point you to something like a cuda_fp16_limits.h file, please upvote that answer instead of this one.

【Comments】: