Number of trailing zeroes: answers

【Question title】: Number of trailing zeroes
【Posted】: 2022-05-02 22:40:57
【Question description】:

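I wrote a function trailing_zeroes(int n) that returns the number of trailing zeroes in the binary representation of a number.

Example: 4 in binary is 100, so the function returns 2 in this case.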

unsigned trailing_zeroes(int n) {
    unsigned bits;

    bits = 0;
    while (n >= 0 && !(n & 01)) {
        ++bits;
        if (n != 0)
            n >>= 1;
        else
            break;
    }
    return bits;
}

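The reason for the if statement is that if n equals 0, the loop would otherwise never terminate.

I find code written this way ugly; is there a better way?

I want to avoid using a break statement inside the while, because many people have told me that using it inside while/for can sometimes be considered "informal". I thought of rewriting the function like this, but I don't think it is the best approach: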

unsigned bits;
if (n == 0)
    return bits = 1;

bits = 0;
while (!(n & 01)) {
    ++bits;
    n >>= 1;
}

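【Comments on the question】:

Try __builtin_ctz.
The simplest way is probably to convert to std::string and check that.
What is your specific question? We are not a code review service (as a side note: the code has implementation-defined behavior for some values of n).
Hint: avoid tagging both C and C++. While the question may be relevant to both languages, the best answers may be language-specific.
I don't know why __builtin_ctz in GCC gives an undefined result for zero, but (guesswork) it is probably because the native implementations on different platforms produce different results; the two likely results are zero and the number of bits in the integer type. Rather than making the result platform-specific, GCC left it undefined, so portable code will not call the function with a zero argument.

Tags: c++ c while-loop bitwise-operators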

【Solution 1】:

Your function is incorrect: for 0, it still has an infinite loop. The test should be:

while (n > 0 && !(n & 1))

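Note that you cannot handle negative numbers with this approach, so your function should probably take an unsigned argument, or you can convert the argument to unsigned.

Your function should special-case 0 and use a simpler loop: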

unsigned trailing_zeroes(int n) {
    unsigned bits = 0, x = n;

    if (x) {
        while ((x & 1) == 0) {
            ++bits;
            x >>= 1;
        }
    }
    return bits;
}

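The function above is very simple and easy to understand. It is fairly fast if the result is small. The value returned for 0 is 0, as in your function, which is questionable, since 0 really has as many trailing zeroes as there are value bits in the unsigned type.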
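If you prefer the other convention for 0, a minimal sketch could look like the following. The name trailing_zeroes_or_width is made up for illustration, and the code assumes unsigned has no padding bits, so that its width is sizeof(unsigned) * CHAR_BIT:

```c
#include <limits.h>  /* CHAR_BIT */

/* Hypothetical variant: counts trailing zero bits, but reports the full
   width of `unsigned` for an input of 0. Assumes `unsigned` has no
   padding bits, so its width is sizeof(unsigned) * CHAR_BIT. */
unsigned trailing_zeroes_or_width(unsigned x) {
    unsigned bits = 0;

    if (x == 0)
        return (unsigned)(sizeof x * CHAR_BIT);

    /* x is nonzero here, so a 1 bit is eventually reached */
    while ((x & 1) == 0) {
        ++bits;
        x >>= 1;
    }
    return bits;
}
```

Which value to return for 0 is a matter of definition; callers just need to know which one was chosen.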

There is a more efficient approach that takes a constant number of steps:

unsigned trailing_zeroes(int n) {
    unsigned bits = 0, x = n;

    if (x) {
        /* assuming `x` has 32 bits: let's count the low-order 0 bits in batches */
        /* mask the 16 low order bits, add 16 and shift them out if they are all 0 */
        if (!(x & 0x0000FFFF)) { bits += 16; x >>= 16; }
        /* mask the 8 low order bits, add 8 and shift them out if they are all 0 */
        if (!(x & 0x000000FF)) { bits +=  8; x >>=  8; }
        /* mask the 4 low order bits, add 4 and shift them out if they are all 0 */
        if (!(x & 0x0000000F)) { bits +=  4; x >>=  4; }
        /* mask the 2 low order bits, add 2 and shift them out if they are all 0 */
        if (!(x & 0x00000003)) { bits +=  2; x >>=  2; }
        /* mask the low order bit and add 1 if it is 0 */
        bits += (x & 1) ^ 1;
    }
    return bits;
}

Note that we can handle any larger int size by changing the first step to:

while (!(x & 0x0000FFFF)) { bits += 16; x >>= 16; }
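As a rough sketch of how that change plays out on a wider type, here is the same batching applied to a 64-bit value. The function name trailing_zeroes64 and the use of uint64_t are assumptions made for this example, not part of the code above:

```c
#include <stdint.h>

/* Sketch: the batched count with the first step turned into a loop,
   applied to a 64-bit value. Returns 0 for x == 0, like the code above. */
unsigned trailing_zeroes64(uint64_t x) {
    unsigned bits = 0;

    if (x) {
        /* x is nonzero, so this loop runs at most 3 times for 64 bits */
        while (!(x & 0xFFFF)) { bits += 16; x >>= 16; }
        /* from here on a set bit is known to be within the low 16 bits */
        if (!(x & 0x00FF)) { bits += 8; x >>= 8; }
        if (!(x & 0x000F)) { bits += 4; x >>= 4; }
        if (!(x & 0x0003)) { bits += 2; x >>= 2; }
        bits += (unsigned)((x & 1) ^ 1);
    }
    return bits;
}
```

The loop is only safe because of the enclosing test for a nonzero x; without it, an input of 0 would never leave the loop.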

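Some compilers have a built-in function __builtin_ctz() that counts trailing zeroes using very efficient assembly code. It is not a C standard function, but if it is available you may want to use it, at the cost of reduced portability. Check your compiler's documentation.

Here is the excerpt from the GCC documentation:

Built-in Function: int __builtin_ctz (unsigned int x)

Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.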
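Because the result is undefined for 0, a caller has to handle that case itself. A minimal sketch, assuming a compiler that provides __builtin_ctz (such as GCC or Clang) and choosing to return the width of unsigned for 0; the wrapper name ctz_or_width is made up for illustration:

```c
#include <limits.h>  /* CHAR_BIT */

/* Hypothetical wrapper (the name is illustrative). Requires a compiler
   that provides __builtin_ctz, such as GCC or Clang. */
unsigned ctz_or_width(unsigned x) {
    if (x == 0)
        return (unsigned)(sizeof x * CHAR_BIT);  /* chosen convention for 0 */
    return (unsigned)__builtin_ctz(x);           /* x is nonzero here */
}
```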

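【Comments】:

How about unsigned bits = sizeof n * CHAR_BIT; ... if (x) { bits = 0; ... } to handle the 0 case? (I guess it all depends on how you define trailing.) I can see it being either 0 or sizeof n * CHAR_BIT, depending on the definition.
@DavidC.Rankin: As you can see from the gcc documentation, 0 has to be special-cased for the built-in function as well.
For negative operands, the result of >> is implementation-defined. It is well-defined for unsigned.
@ClaudioPisa: For constant values, it is simple: I will add comments in the source code.
@ClaudioPisa: You can accept the answer by clicking the gray checkmark below the score.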

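【Solution 2】:

As already mentioned, there is a built-in function that does this, and since it may use hardware support, it can be very fast. However, the doc for GCC does say that the result is undefined if the input is 0. Since this is an extension, it may not be available for your compiler.

Otherwise, whenever someone says "bit manipulation" or "bit counting", you need to reach for your copy of "Hacker's Delight". A great book; I bought both editions. About 4 pages (1st edition) are devoted to this: "ntz" (number of trailing zeros). If you already have an "nlz" (number of leading zeros) or a "popcnt" function, then you can get ntz directly. Otherwise the book gives several implementations: some use popcnt, some use a loop, and some use binary search.

For example,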

int ntz3(unsigned x) {
   int n;

   if (x == 0) return(32);
   n = 1;
   if ((x & 0x0000FFFF) == 0) {n = n +16; x = x >>16;}
   if ((x & 0x000000FF) == 0) {n = n + 8; x = x >> 8;}
   if ((x & 0x0000000F) == 0) {n = n + 4; x = x >> 4;}
   if ((x & 0x00000003) == 0) {n = n + 2; x = x >> 2;}
   return n - (x & 1);
}
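The remark about getting ntz directly from a popcnt function can be sketched as follows. The identity used is ntz(x) = pop(~x & (x - 1)); the helper popcount32 is a simple portable stand-in written for this example, and both function names are assumptions rather than code taken from the book:

```c
#include <stdint.h>

/* Portable population count: clears the lowest set bit once per iteration. */
static unsigned popcount32(uint32_t x) {
    unsigned count = 0;
    while (x) {
        x &= x - 1;
        ++count;
    }
    return count;
}

/* ntz via popcount: ~x & (x - 1) has 1 bits exactly at the trailing zero
   positions of x, so counting them gives the answer. For x == 0 the mask
   is all ones and the result is 32. */
unsigned ntz_via_popcount(uint32_t x) {
    return popcount32((uint32_t)(~x & (x - 1)));
}
```

On hardware with a population count instruction, replacing popcount32 with the compiler's intrinsic would make this branch-free.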

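【Comments】:

Please provide a reference to the standard that defines this built-in (or any built-in function).

【Solution 3】:

Henry Warren covers various approaches to ntz in "Hacker's Delight".

I think the De Bruijn sequence solution is downright crazy. See https://en.wikipedia.org/wiki/De_Bruijn_sequence#Finding_least-_or_most-significant_set_bit_in_a_word.

Here is a 64-bit implementation, like the ones used in chess engines to handle "bitboards".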

int ntz(uint64_t x) {
    // We return the number of trailing zeros in
    // the binary representation of x.
    //
    // We have that 0 <= x < 2^64.
    //
    // We begin by applying a function sensitive only
    // to the least significant bit (lsb) of x:
    //
    //   x -> x^(x-1)  e.g. 0b11001000 -> 0b00001111
    //
    // Observe that x^(x-1) == 2^(ntz(x)+1) - 1.

    uint64_t y = x^(x-1);

    // Next, we multiply by 0x03f79d71b4cb0a89,
    // and then roll off the first 58 bits.

    constexpr uint64_t debruijn = 0x03f79d71b4cb0a89;

    uint8_t z = (debruijn*y) >> 58;

    // What? Don't look at me like that.
    //
    // With 58 bits rolled off, only 6 bits remain,
    // so we must have one of 0, 1, 2, ..., 63.
    //
    // It turns out this number was judiciously
    // chosen to make it so each of the possible
    // values for y were mapped into distinct slots.
    //
    // So we just use a look-up table of all 64
    // possible answers, which have been precomputed in
    // advance by the sort of people who write
    // chess engines in their spare time:

    constexpr std::array<int,64> lookup = {
         0, 47,  1, 56, 48, 27,  2, 60,
        57, 49, 41, 37, 28, 16,  3, 61,
        54, 58, 35, 52, 50, 42, 21, 44,
        38, 32, 29, 23, 17, 11,  4, 62,
        46, 55, 26, 59, 40, 36, 15, 53,
        34, 51, 20, 43, 31, 22, 10, 45,
        25, 39, 14, 33, 19, 30,  9, 24,
        13, 18,  8, 12,  7,  6,  5, 63
    };

    return lookup[z];
}
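The first step of that function, x -> x^(x-1), can be checked in isolation. The sketch below is written in C and covers only that step, not the multiplication or the table; the names low_bits_mask and check_mask_identity are made up for illustration:

```c
#include <stdint.h>

/* First step only: x ^ (x - 1) sets the lowest set bit of x and every
   bit below it, and clears everything above. */
uint64_t low_bits_mask(uint64_t x) {
    return x ^ (x - 1);
}

/* Returns 1 if, for every k in 0..63, the mask of 2^k equals 2^(k+1) - 1;
   returns 0 on the first mismatch. */
int check_mask_identity(void) {
    for (unsigned k = 0; k < 64; ++k) {
        uint64_t x = UINT64_C(1) << k;
        /* 2^64 - 1 cannot be formed by shifting, so handle k == 63 apart */
        uint64_t expected = (k == 63) ? UINT64_MAX
                                      : (UINT64_C(1) << (k + 1)) - 1;
        if (low_bits_mask(x) != expected)
            return 0;
    }
    return 1;
}
```

Since the mask depends only on the position of the lowest set bit, there are just 64 possible values of y for nonzero x, which is why a 64-entry table is enough.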

【Comments】: