Reverse the order of bits in a bit array: answers

【Question title】: Reverse the order of bits in a bit array
【Posted】: 2016-02-27 21:10:43
【Question】:

I have a long sequence of bits stored in an array of unsigned long integers, like this:

struct bit_array
{
    int size; /* nr of bits */
    unsigned long *array; /* the container that stores bits */
};

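I am trying to design an algorithm that reverses the order of the bits in *array. The problems:

size can be anything, i.e. not necessarily a multiple of 8 or 32 etc., so the first bit of the input array can end up at any position within an unsigned long in the output array;
the algorithm should be platform-independent, i.e. work for any sizeof(unsigned long).

Code, pseudocode, algorithm descriptions etc. are all welcome: anything better than the brute-force ("bit by bit") approach.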

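【Comments】:

"[T]he first bit in the input array can end up at any position within an unsigned long in the output array"? I'm not sure I understand. Wouldn't the first bit be at the first position of the first long? Don't you mean the last bit?
I think the issue is that if there are 57 bits in the bit array, then bit number 0 needs to be swapped with bit number 56. But before we can do anything, we need to know whether bit 0 of the array is stored in the MSB or the LSB of array element 0 (or, if bit 0 is not in element 0, we need to know where bit 0 is stored).
@Jonathan and Eques: Aha, this is about reversing the order! I thought it was just about inverting each bit. Sorry for the misunderstanding.
Why not add two extra fields to the struct that define the direction and the number of bits to skip? Then write accessor routines whose behavior depends on the direction.
Do you have control over the definition of this struct? If so, why store the bits in an unsigned long * rather than a uint8_t *? That would rule out the platform-dependence problems.

Tags: c bitarray

【Answer 1】:

I like the lookup table idea. Still, this is also a typical task for log(n) group bit tricks, which can be very fast. Like: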

/* Assumes unsigned long is exactly 64 bits wide. */
unsigned long reverseOne(unsigned long x) {
  x = ((x & 0xFFFFFFFF00000000) >> 32) | ((x & 0x00000000FFFFFFFF) << 32);
  x = ((x & 0xFFFF0000FFFF0000) >> 16) | ((x & 0x0000FFFF0000FFFF) << 16);
  x = ((x & 0xFF00FF00FF00FF00) >> 8)  | ((x & 0x00FF00FF00FF00FF) << 8);
  x = ((x & 0xF0F0F0F0F0F0F0F0) >> 4)  | ((x & 0x0F0F0F0F0F0F0F0F) << 4);
  x = ((x & 0xCCCCCCCCCCCCCCCC) >> 2)  | ((x & 0x3333333333333333) << 2);
  x = ((x & 0xAAAAAAAAAAAAAAAA) >> 1)  | ((x & 0x5555555555555555) << 1);
  return x;
}

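The basic idea is that when we want to reverse the order of some sequence, we can swap the head and tail halves of the sequence and then reverse each half separately (done here by applying the same procedure recursively to each half).

Here is a more portable version that supports unsigned long widths of 4, 8, 16 or 32 bytes.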

#include <limits.h>

/* The width tests chain 32-bit shifts so that no single preprocessor
   shift count can reach the width of uintmax_t. */
#define ones32 0xFFFFFFFFUL
#if (ULONG_MAX >> 32 >> 32 >> 32 >> 32)   /* wider than 128 bits */
#define fill32(x) ((x)|((x)<<32)|((x)<<64)|((x)<<96)|((x)<<128)|((x)<<160)|((x)<<192)|((x)<<224))
#define patt128 (ones32|(ones32<<32)|(ones32<<64) |(ones32<<96))
#define patt64  (ones32|(ones32<<32)|(ones32<<128)|(ones32<<160))
#define patt32  (ones32|(ones32<<64)|(ones32<<128)|(ones32<<192))
#else
#if (ULONG_MAX >> 32 >> 32)               /* wider than 64 bits */
#define fill32(x) ((x)|((x)<<32)|((x)<<64)|((x)<<96))
#define patt64  (ones32|(ones32<<32))
#define patt32  (ones32|(ones32<<64))
#else
#if (ULONG_MAX >> 32)                     /* wider than 32 bits */
#define fill32(x) ((x)|((x)<<32))
#define patt32  (ones32)
#else
#define fill32(x) (x)
#endif
#endif
#endif

unsigned long reverseOne(unsigned long x) {
  /* Swap halves, then quarters, and so on, down to adjacent bits. */
#if (ULONG_MAX >> 32)
#if (ULONG_MAX >> 32 >> 32)
#if (ULONG_MAX >> 32 >> 32 >> 32 >> 32)
  x = ((x & ~patt128) >> 128) | ((x & patt128) << 128);
#endif
  x = ((x & ~patt64) >> 64) | ((x & patt64) << 64);
#endif
  x = ((x & ~patt32) >> 32) | ((x & patt32) << 32);
#endif
  x = ((x & fill32(0xffff0000UL)) >> 16) | ((x & fill32(0x0000ffffUL)) << 16);
  x = ((x & fill32(0xff00ff00UL)) >> 8)  | ((x & fill32(0x00ff00ffUL)) << 8);
  x = ((x & fill32(0xf0f0f0f0UL)) >> 4)  | ((x & fill32(0x0f0f0f0fUL)) << 4);
  x = ((x & fill32(0xccccccccUL)) >> 2)  | ((x & fill32(0x33333333UL)) << 2);
  x = ((x & fill32(0xaaaaaaaaUL)) >> 1)  | ((x & fill32(0x55555555UL)) << 1);
  return x;
}
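
The same halving idea can also be written as a loop that derives each mask from the previous one, so no width-specific constants are needed. This is only a sketch: the name reverse_word_loop is made up for illustration, and it assumes the width of unsigned long is a power of two with no padding bits.

```c
#include <limits.h>

/* Hypothetical helper: reverse all bits of x.
   Assumes the width of unsigned long is a power of two. */
unsigned long reverse_word_loop(unsigned long x)
{
    unsigned int s = (unsigned int)(sizeof x * CHAR_BIT); /* width in bits */
    unsigned long mask = ~0UL;

    while ((s >>= 1) > 0) {
        /* mask now selects the low s bits of every 2*s-bit group */
        mask ^= mask << s;
        /* swap the two s-bit halves of every group */
        x = ((x >> s) & mask) | ((x << s) & ~mask);
    }
    return x;
}
```

On a 64-bit unsigned long the masks generated by the loop are the same constants as in the first version above (0x00000000FFFFFFFF, 0x0000FFFF0000FFFF, and so on).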

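【Comments】:

Nice one. But the size of unsigned long may vary.
This code is not portable and has a high potential for integer-type bugs. You need to change unsigned long to uint64_t, and use const uint64_t variables instead of magic numbers to ensure that all integer literals have type uint64_t. For example, the literal 0x00000000FFFFFFFF is a 32-bit unsigned integer, not a 64-bit signed integer as you might assume, because adding lots of zeros at the beginning of a literal does not make it a 64-bit type.
Now imagine that int is 64 bits on a given system: suddenly there will be integer promotion of this literal, and suddenly it is a signed type. Why use signed literals in the first place? When doing "bit-fiddling" operations you almost never want signed types. The algorithm presented here is nice, but C code like this is a ticking time bomb.

【Answer 2】:

I would split the problem into two parts.

First, I would ignore the fact that the number of bits used is not a multiple of 32. I would use one of the given methods to swap the whole array like this.

Pseudocode: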

for half the longs in the array:
    take the first longword;
    take the last longword;
    swap the bits in the first longword
    swap the bits in the last longword;

    store the swapped first longword into the last location;
    store the swapped last longword into the first location;

Then fix up the fact that the first few bits (call their number n) are actually garbage bits from the end of the longs:

for all of the longs in the array:
    split the value in the leftmost n bits and the rest;
    store the leftmost n bits into the righthand part of the previous word;
    shift the rest bits to the left over n positions (making the rightmost n bits zero);
    store them back;

Of course, you could try to fold this into a single pass over the whole array. Something like this:

for half the longs in the array:
    take the first longword;
    take the last longword;
    swap the bits in the first longword
    swap the bits in the last longword;

    split both value in the leftmost n bits and the rest;

    for the new first longword:
        store the leftmost n bits into the righthand side of the previous word;
        store the remaining bits into the first longword, shifted left;

    for the new last longword:
        remember the leftmost n bits for the next iteration;
        store the remembered leftmost n bits, combined with the remaining bits, into the last longword;

    store the swapped first longword into the last location;
    store the swapped last longword into the first location;

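I have abstracted away from the edge cases here (the first and last longword), and you may need to reverse the shift direction depending on how the bits are ordered within each longword.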
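
The two-pass variant could look roughly like the sketch below. It rests on assumptions that the question leaves open: bit i is taken to live in array[i / W] at position i % W counted from the LSB (W being the width of unsigned long), so the unused bits are the high bits of the last word, and the final shift therefore goes right rather than left. The names reverse_word, reverse_bit_array and WORD_BITS are illustrative only.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Reverse the bits of one word (width assumed to be a power of two). */
static unsigned long reverse_word(unsigned long x)
{
    unsigned int s = (unsigned int)WORD_BITS;
    unsigned long mask = ~0UL;

    while ((s >>= 1) > 0) {
        mask ^= mask << s;
        x = ((x >> s) & mask) | ((x << s) & ~mask);
    }
    return x;
}

/* Reverse the first `size` bits stored in `array` (LSB-first layout assumed). */
void reverse_bit_array(unsigned long *array, size_t size)
{
    size_t nwords = (size + WORD_BITS - 1) / WORD_BITS;
    size_t pad = nwords * WORD_BITS - size;   /* unused bits in the last word */
    size_t i, j;

    if (nwords == 0)
        return;

    /* Pass 1: swap words end for end, reversing the bits inside each word. */
    for (i = 0, j = nwords - 1; i < j; i++, j--) {
        unsigned long tmp = reverse_word(array[i]);
        array[i] = reverse_word(array[j]);
        array[j] = tmp;
    }
    if (i == j)                               /* middle word of an odd count */
        array[i] = reverse_word(array[i]);

    /* Pass 2: the former padding now sits in the low bits of word 0;
       shift the whole array down by `pad` bits to drop it. */
    if (pad != 0) {
        for (i = 0; i + 1 < nwords; i++)
            array[i] = (array[i] >> pad) | (array[i + 1] << (WORD_BITS - pad));
        array[nwords - 1] >>= pad;
    }
}
```

With this layout, any garbage in the unused high bits of the last input word is shifted out in pass 2, and the unused bits of the last output word end up as zero.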

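【Comments】:

【Answer 3】:

The fact that the size is not a multiple of the number of bits in a long (sizeof(long) * CHAR_BIT) is the hardest part of the problem. It can result in a lot of bit shifting.

But you don't have to do that if you can introduce a new struct member: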

struct bit_array
{
    int size; /* nr of bits */
    int offset; /* First bit position */
    unsigned long *array; /* the container that stores bits */
};

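offset tells you how many bits to ignore at the start of the array.

Then you only need to perform the following steps:

Reverse the array elements.
Swap the bits of each element. There are many tricks in the other answers, but your compiler may also provide intrinsics that do it in fewer instructions (such as the RBIT instruction on some ARM cores).
Compute the new starting offset. This equals the number of unused bits in the last element.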
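
A minimal sketch of these three steps, under assumptions not stated in the answer: logical bit i is stored at physical position offset + i, and physical position p maps to array[p / W] bit p % W counted from the LSB. The struct name bit_array_off, the size_t field types and the helper names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

struct bit_array_off {
    size_t size;            /* nr of bits */
    size_t offset;          /* first bit position */
    unsigned long *array;   /* the container that stores bits */
};

/* Reverse the bits of one word (width assumed to be a power of two). */
static unsigned long reverse_word(unsigned long x)
{
    unsigned int s = (unsigned int)WORD_BITS;
    unsigned long mask = ~0UL;

    while ((s >>= 1) > 0) {
        mask ^= mask << s;
        x = ((x >> s) & mask) | ((x << s) & ~mask);
    }
    return x;
}

/* Read logical bit i, skipping the first `offset` physical bits. */
int get_bit(const struct bit_array_off *b, size_t i)
{
    size_t p = b->offset + i;
    return (int)((b->array[p / WORD_BITS] >> (p % WORD_BITS)) & 1UL);
}

void reverse_with_offset(struct bit_array_off *b)
{
    size_t nwords = (b->offset + b->size + WORD_BITS - 1) / WORD_BITS;
    size_t i, j;

    if (nwords == 0)
        return;

    /* Steps 1 and 2: reverse the element order and the bits of each element. */
    for (i = 0, j = nwords - 1; i < j; i++, j--) {
        unsigned long tmp = reverse_word(b->array[i]);
        b->array[i] = reverse_word(b->array[j]);
        b->array[j] = tmp;
    }
    if (i == j)
        b->array[i] = reverse_word(b->array[i]);

    /* Step 3: the unused bits at the end become the new leading offset. */
    b->offset = nwords * WORD_BITS - b->offset - b->size;
}
```

No cross-word shifting is needed; the cost moves into the accessors, which have to add offset on every lookup.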

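【Comments】:

Why not use uint8_t* for the array instead?
@Lundin The OP says the size is not a multiple of 8 either, so I don't think that helps. In fact, I think the best data type for the array elements should be chosen according to the size of the CPU registers on the target architecture.
Even if the type is uint8_t, you can still access the data in chunks of "CPU register size". But you can't do the opposite, so on a small CPU unsigned long would be very burdensome.
@Lundin Yes, I don't think unsigned long is the optimal data type. But with uint8_t on a 32-bit CPU (for example), if you want to process 4-byte chunks you have to handle alignment manually and treat the last element specially if the array size is not divisible by 4.

【Answer 4】:

You have to define the bit order within an unsigned long. You might assume that bit n corresponds to array[x] & (1UL << n), but this needs to be specified. If so, you need to deal with endianness (little or big endian) if you intend to access the array by bytes rather than by unsigned long.

I would definitely implement the brute-force version first and measure whether speed is an issue. If it is not used heavily on large arrays, there is no need to waste time trying to optimize it. An optimized version can be tricky to implement correctly. If you do end up trying anyway, the brute-force version can be used to verify correctness on test values and to benchmark the speed of the optimized version.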
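
A brute-force reference along these lines might look as follows. It is a sketch that assumes the LSB-first layout mentioned above (bit i in array[i / W], position i % W); the names get_bit, put_bit and reverse_bits_naive are made up for illustration.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Read bit i of the array (LSB-first layout assumed). */
static int get_bit(const unsigned long *array, size_t i)
{
    return (int)((array[i / WORD_BITS] >> (i % WORD_BITS)) & 1UL);
}

/* Write bit i of the array. */
static void put_bit(unsigned long *array, size_t i, int value)
{
    unsigned long mask = 1UL << (i % WORD_BITS);

    if (value)
        array[i / WORD_BITS] |= mask;
    else
        array[i / WORD_BITS] &= ~mask;
}

/* Swap bit 0 with bit size-1, bit 1 with bit size-2, and so on. */
void reverse_bits_naive(unsigned long *array, size_t size)
{
    size_t lo, hi;

    if (size < 2)
        return;
    for (lo = 0, hi = size - 1; lo < hi; lo++, hi--) {
        int a = get_bit(array, lo);
        int b = get_bit(array, hi);
        put_bit(array, lo, b);
        put_bit(array, hi, a);
    }
}
```

Because it only touches bits 0 to size-1, any unused bits in the last word are left as they were, which is worth keeping in mind when comparing its output against an optimized version.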

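【Comments】:

【Answer 5】:

My favorite solution is to fill a lookup table that does bit reversal on a single byte (hence 256 byte entries).

You apply the table to 1 to 4 bytes of the input operand, with a swap. If the size is not a multiple of 8, you need a final right shift to adjust.

This scales well to larger integers.

Example: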

11 10010011 00001010 -> 01010000 11001001 11000000 -> 01 01000011 00100111

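To split the number into bytes portably, you need to use bitwise masks and shifts; mapping a struct or byte array onto the integer can be more efficient.

For brute-force performance, you could consider mapping up to 16 bits at a time, but that does not look very reasonable.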
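
The table approach can be sketched as below, using portable masks and shifts and treating a byte as 8 bits. The names rev8, init_rev8 and reverse_low_bits are hypothetical, and the table is filled at run time here rather than written out as a constant.

```c
/* 256-entry table: rev8[b] is b with its 8 bits reversed. */
static unsigned char rev8[256];

/* Fill the table once before first use. */
void init_rev8(void)
{
    unsigned int b, k;

    for (b = 0; b < 256; b++) {
        unsigned int r = 0;
        for (k = 0; k < 8; k++)
            if (b & (1u << k))
                r |= 1u << (7 - k);
        rev8[b] = (unsigned char)r;
    }
}

/* Reverse the low nbits bits of x, with 1 <= nbits <= width of unsigned long.
   Bits of x above nbits are ignored. */
unsigned long reverse_low_bits(unsigned long x, unsigned int nbits)
{
    unsigned int nbytes = (nbits + 7) / 8;
    unsigned long r = 0;
    unsigned int i;

    /* Reverse each byte through the table while swapping the byte order. */
    for (i = 0; i < nbytes; i++) {
        r = (r << 8) | rev8[x & 0xFF];
        x >>= 8;
    }
    /* Final right shift when nbits is not a multiple of 8. */
    return r >> (nbytes * 8 - nbits);
}
```

For the 18-bit example above, reverse_low_bits(0x3930AUL, 18) goes through the intermediate value 0x50C9C0 and returns 0x14327 after the final shift by 6.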

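【Comments】:

This question seems to be at least partly about efficiency ("anything better than brute force" etc.); would this lookup table really be as efficient as bit operations on a long? I don't really know either way, but until I see measurements I suspect it might be much slower.
A lookup in a data structure is at best served from the top-level cache, compared with bit operations in registers.
@ThomasPadron-McCarthy: It depends on the target, of course. On an 8-bit CPU the table will be much faster, because those usually have no single-cycle shifter for arbitrary counts and handle 8-bit data anyway. For 16 bits, it depends. For 32 bits it may be slower, and much slower if the CPU has a bit-reverse instruction.
@Olaf: Not in my desktop computer... but yes, I need to get out more.
@erip: 30 cycles sounds very pessimistic for L1 cache latency. Even then, latency may not matter. An optimized loop will probably be mostly bandwidth-bound (or perhaps even compute-bound).

【Answer 6】:

In a collection of related topics, which can be found here, the bits of a single array entry can be reversed as follows.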

#include <limits.h> /* for CHAR_BIT */

unsigned int v;     // input bits to be reversed
unsigned int r = v; // r will be reversed bits of v; first get LSB of v
int s = sizeof(v) * CHAR_BIT - 1; // extra shift needed at end

for (v >>= 1; v; v >>= 1)
{
  r <<= 1;
  r |= v & 1;
  s--;
}
r <<= s; // shift when v's highest bits are zero

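Afterwards, the reversal of the entire array can be completed by rearranging the individual positions.

【Comments】:

The variable names "input", "output" and "size"/"count" would make more sense. Don't just copy/paste solutions from that site; it is notorious for poor code readability.
Thanks for the comment; basically I just copied and pasted so as not to introduce errors. But maybe a home-made implementation would be better.
The final reversal is not that easy to do.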