【Question title】: How to efficiently transpose a 2D bit matrix
【Posted】: 2017-06-06 07:27:34
【Question description】:

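I keep stumbling over this problem (e.g. in this question). We are given a 2D bit matrix/board/array in the form of an array of a primitive integer type, e.g. an array of long. For simplicity we can assume a square matrix, e.g. an array of 64 long values on a platform with 64-bit long.

Let x[i] for 0 <= i < 64 be the input array. Compute an array y[i] for 0 <= i < 64 such that, for all 0 <= i, j < 64:

((x[i] >> j) & 1) == ((y[j] >> i) & 1)

Here x >> i is the bitwise right shift of x by i bits, & is bitwise AND, and x[i] is the value at position i of the array x.

How can a function that maps the array x to the array y be implemented most efficiently?

I am primarily looking for non-destructive methods that leave the input array x unchanged.

Implementation language

The programming language used should have arrays and bitwise operations on integer types. Many languages meet these requirements. C/C++ and Java solutions look very similar, so let's choose those languages.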
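
As a baseline for the specification above, a bit-by-bit version in C might look like the sketch below. The name transpose64_naive is hypothetical and not from the question; it is meant only as a correctness reference against which faster versions can be checked, not as the efficient method being asked for.

```c
#include <stdint.h>

/* Hypothetical reference implementation (not from the question):
 * builds y so that bit i of y[j] equals bit j of x[i].
 * x is only read, so the method is non-destructive.
 * x and y are assumed not to overlap. */
void transpose64_naive(const uint64_t x[64], uint64_t y[64]) {
    for (int j = 0; j < 64; j++) {
        uint64_t row = 0;
        for (int i = 0; i < 64; i++) {
            /* Take bit j of x[i] and place it at bit position i. */
            row |= ((x[i] >> j) & 1ULL) << i;
        }
        y[j] = row;
    }
}
```

This touches all 64 * 64 = 4096 bits individually, so it is slow, but it follows the formula literally.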

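【Comments】:

  • This sounds like homework or a test, albeit a very advanced one. I suggest you answer it yourself; your reputation indicates you are advanced enough. Good luck!
  • @Paul I can assure you this is not a homework question. It is a problem I run into from time to time (see the added link). If nobody answers here, I may post a solution myself. I think the problem deserves a Q&A on SO.
  • Don't you mean (x[i] >> j), i.e. a right shift? After a left shift by a nonzero amount, & 1 would not give an interesting result.
  • Folks, please don't assume this is homework just because I know how to write a good question/specification. It is really frustrating.

Tags: java c arrays performance bit-manipulation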


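【Solution 1】:

This seems to be a generalization of the question Bitwise transpose of 8 bytes. That question is only about the 8x8 transposition, so what you are asking is a bit different. But your question is also answered in section 7.3 of the book Hacker's Delight (you may be able to see the relevant pages on Google Books). The code shown there apparently originates with Guy Steele.

The Hacker's Delight website only contains the source code from the book for the 8x8 and 32x32 cases, but the latter generalizes trivially to your 64x64 case: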

#include <stdint.h>

void
transpose64(uint64_t a[64]) {
  int j, k;
  uint64_t m, t;

  for (j = 32, m = 0x00000000FFFFFFFF; j; j >>= 1, m ^= m << j) {
    for (k = 0; k < 64; k = ((k | j) + 1) & ~j) {
      t = (a[k] ^ (a[k | j] >> j)) & m;
      a[k] ^= t;
      a[k | j] ^= (t << j);
    }
  }
}

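The way this works is that the function swaps successively smaller blocks of bits, starting with 32x32 blocks (without transposing the bits inside those blocks), then swapping the appropriate 16x16 blocks within those 32x32 blocks, and so on. The variable that holds the block size is j. The outer loop therefore has j take the values 32, 16, 8, 4, 2 and 1 in turn, which means the outer loop runs six times. The inner loop runs over half of the rows, namely the rows for which the bit of k selected by j is zero. When j is 32 these are rows 0-31, when j is 16 these are rows 0-15 and 32-47, and so on. In total the body of the inner loop runs 6*32 = 192 times. Inside that body, the mask m determines which bits should be swapped, t receives the XOR of those bits, and that XOR is then used to update the bits in both places.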
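
To make the loop body easier to follow in isolation, here is a small sketch that factors it into two helpers. The names masked_swap and mask_for_block are hypothetical and are not part of the book's code; they only restate what the three XOR lines and the m ^= m << j update do.

```c
#include <stdint.h>

/* Hypothetical helper: exchanges the bits of *top selected by m with the
 * bits of *bottom selected by (m << j), using the same three XOR steps as
 * the loop body of transpose64 (top plays a[k], bottom plays a[k | j]). */
void masked_swap(uint64_t *top, uint64_t *bottom, int j, uint64_t m) {
    uint64_t t = (*top ^ (*bottom >> j)) & m; /* 1 where the two fields differ */
    *top ^= t;                                /* top now holds bottom's field  */
    *bottom ^= t << j;                        /* bottom now holds top's field  */
}

/* Hypothetical helper: the mask the outer loop uses for block size j,
 * where j is one of 32, 16, 8, 4, 2, 1. */
uint64_t mask_for_block(int j) {
    uint64_t m = 0x00000000FFFFFFFFULL;       /* mask for j == 32 */
    for (int b = 32; b > j; ) {
        b >>= 1;
        m ^= m << b;                          /* split every run of ones in half */
    }
    return m;
}
```

For example, with top = 0x1111111122222222, bottom = 0x3333333344444444 and j = 32, the call leaves top = 0x1111111133333333 and bottom = 0x2222222244444444: the low half of the first word and the high half of the second have traded places. The masks for j = 16 and j = 1 come out as 0x0000FFFF0000FFFF and 0x5555555555555555.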

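The book (and the website) also has a version of this code in which both loops are unrolled and in which the mask m is not computed but simply assigned. I guess whether that is an improvement depends on things like the number of registers and the size of the instruction cache.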
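
As an illustration of assigning the masks instead of computing them, the sketch below keeps the loops but reads the six masks from a constant table. This is not the unrolled code from the book; the names TRANSPOSE_MASKS and transpose64_table are hypothetical, and whether it is any faster than the original loop is something that would have to be measured.

```c
#include <stdint.h>

/* Masks for block sizes 32, 16, 8, 4, 2, 1: the same values that the loop
 * in transpose64 derives step by step with m ^= m << j. */
static const uint64_t TRANSPOSE_MASKS[6] = {
    0x00000000FFFFFFFFULL,
    0x0000FFFF0000FFFFULL,
    0x00FF00FF00FF00FFULL,
    0x0F0F0F0F0F0F0F0FULL,
    0x3333333333333333ULL,
    0x5555555555555555ULL
};

/* Hypothetical variant of transpose64 with assigned (tabulated) masks.
 * Transposes a in place, with the same bit convention as transpose64. */
void transpose64_table(uint64_t a[64]) {
    for (int s = 0; s < 6; s++) {
        const int j = 32 >> s;                 /* block size for this pass */
        const uint64_t m = TRANSPOSE_MASKS[s]; /* constant mask, no update */
        for (int k = 0; k < 64; k = ((k | j) + 1) & ~j) {
            uint64_t t = (a[k] ^ (a[k | j] >> j)) & m;
            a[k] ^= t;
            a[k | j] ^= t << j;
        }
    }
}
```

With constant trip counts and constant masks, a compiler is free to unroll either loop itself.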

To test that this works, suppose we define some bit pattern, say:

uint64_t logo[] = {
0b0000000000000000000000000000000000000000000100000000000000000000,
0b0000000000000000000000000000000000000000011100000000000000000000,
0b0000000000000000000000000000000000000000111110000000000000000000,
0b0000000000000000000000000000000000000001111111000000000000000000,
0b0000000000000000000000000000000000000000111111100000000000000000,
0b0000000000000000000000000000000000000000111111100000000000000000,
0b0000000000000000000000000000000000000000011111110000000000000000,
0b0000000000000000000000000000000000000000001111111000000000000000,
0b0000000000000000000000000000000000000000001111111100000000000000,
0b0000000000000000000000000000000010000000000111111100000000000000,
0b0000000000000000000000000000000011100000000011111110000000000000,
0b0000000000000000000000000000000111110000000001111111000000000000,
0b0000000000000000000000000000001111111000000001111111100000000000,
0b0000000000000000000000000000011111111100000000111111100000000000,
0b0000000000000000000000000000001111111110000000011111110000000000,
0b0000000000000000000000000000000011111111100000001111111000000000,
0b0000000000000000000000000000000001111111110000001111111100000000,
0b0000000000000000000000000000000000111111111000000111111100000000,
0b0000000000000000000000000000000000011111111100000011111110000000,
0b0000000000000000000000000000000000001111111110000001111111000000,
0b0000000000000000000000000000000000000011111111100001111111100000,
0b0000000000000000000000001100000000000001111111110000111111100000,
0b0000000000000000000000001111000000000000111111111000011111110000,
0b0000000000000000000000011111110000000000011111111100001111100000,
0b0000000000000000000000011111111100000000001111111110001111000000,
0b0000000000000000000000111111111111000000000011111111100110000000,
0b0000000000000000000000011111111111110000000001111111110000000000,
0b0000000000000000000000000111111111111100000000111111111000000000,
0b0000000000000000000000000001111111111111100000011111110000000000,
0b0000000000000000000000000000011111111111111000001111100000000000,
0b0000000000000000000000000000000111111111111110000011000000000000,
0b0000000000000000000000000000000001111111111111100000000000000000,
0b0000000000000000000000000000000000001111111111111000000000000000,
0b0000000000000000000000000000000000000011111111111100000000000000,
0b0000000000000000000111000000000000000000111111111100000000000000,
0b0000000000000000000111111110000000000000001111111000000000000000,
0b0000000000000000000111111111111100000000000011111000000000000000,
0b0000000000000000000111111111111111110000000000110000000000000000,
0b0000000000000000001111111111111111111111100000000000000000000000,
0b0000000000000000001111111111111111111111111111000000000000000000,
0b0000000000000000000000011111111111111111111111100000000000000000,
0b0000001111110000000000000001111111111111111111100000111111000000,
0b0000001111110000000000000000000011111111111111100000111111000000,
0b0000001111110000000000000000000000000111111111100000111111000000,
0b0000001111110000000000000000000000000000001111000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
};

We then call the transpose64 function and print the resulting bit pattern:

#include <stdio.h>

void
printbits(uint64_t a[64]) {
  int i, j;

  for (i = 0; i < 64; i++) {
    for (j = 63; j >= 0; j--)
      printf("%c", (a[i] >> j) & 1 ? '1' : '0');
    printf("\n");
  }
}

int
main() {
  transpose64(logo);
  printbits(logo);
  return 0;
}

This gives the following output:

0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000011000000011111100000111111
0000000000000000000000000000000000111111000000011111100000111111
0000000000000000000000000000000000111111000000011111100000111111
0000000000000000000000000000000000111111000000011111100000111111
0000000000000000000000000100000000011111000000011111100000111111
0000000000000000000000011110000000011111100000011111100000111111
0000000000000000000001111110000000011111100000011111100000111111
0000000000000000000001111111000000011111100000011111100000111111
0000000000000000000000111111000000011111100000011111100000111111
0000000000000000000000111111100000001111110000011111100000111111
0000000000000000000000011111100000001111110000011111100000111111
0000000000000100000000011111110000001111110000011111100000111111
0000000000001110000000001111110000001111110000011111100000111111
0000000000011110000000001111111000001111110000011111100000111111
0000000001111111000000000111111000000111111000011111100000111111
0000000000111111100000000111111100000111111000011111100000111111
0000000000111111110000000011111100000111111000011111100000111111
0000000000011111111000000011111100000111111000011111100000111111
0000000000001111111100000001111110000011111000011111100000111111
0000000000000111111100000001111110000011111100011111100000111111
0000000000000011111110000000111111000011111100011111100000111111
0001000000000001111111000000111111000011111100011111100000111111
0011110000000001111111100000111111100011111100011111100000111111
0111111000000000111111110000011111100001111100011111100000111111
0111111110000000011111111000011111110001111110011111100000111111
1111111111000000001111111000001111110001111110011111100000111111
0011111111100000000111111100001111111001111110011111100000111111
0001111111111000000011111110000111111001111110011111100000111111
0000111111111100000011111111000111111100111100000000000000111111
0000001111111110000001111111100011111100000000000000000000111111
0000000111111111100000111111110011111000000000000000000000111111
0000000011111111110000011111110001100000000000000000000000111111
0000000000111111111000001111111000000000000000000000000000111111
0000000000011111111110000111111000000000000000000000000000111111
0000000000001111111111000111110000000000011111111111111111111111
0000000000000011111111100011100000000000011111111111111111111111
0000000000000001111111111001000000000000011111111111111111111111
0000000000000000111111111100000000000000011111111111111111111111
0000000000000000001111111100000000000000011111111111111111111111
0000000000000000000111111000000000000000011111111111111111111111
0000000000000000000011110000000000000000000000000000000000000000
0000000000000000000000100000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000

As we hoped, the pattern has been flipped nicely.
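
One thing to keep in mind when comparing this with the question: printbits prints the most significant bit first, so the routine above treats bit 63 as column 0. Read against the question's formula, where bit j is column j, that corresponds to a flip about the other diagonal. If the question's convention is needed literally, a variant with the shift directions mirrored might look like the following sketch (the name transpose64_lsb is hypothetical and is not from the book):

```c
#include <stdint.h>

/* Hypothetical variant of transpose64 for the convention in the question:
 * afterwards, bit i of a[j] equals bit j of the original a[i].
 * Compared with transpose64, the high field of a[k] is exchanged with the
 * low field of a[k | j] instead of the other way round. */
void transpose64_lsb(uint64_t a[64]) {
    int j, k;
    uint64_t m, t;

    for (j = 32, m = 0x00000000FFFFFFFFULL; j; j >>= 1, m ^= m << j) {
        for (k = 0; k < 64; k = ((k | j) + 1) & ~j) {
            t = ((a[k] >> j) ^ a[k | j]) & m; /* differences between fields  */
            a[k] ^= t << j;                   /* update high field of a[k]   */
            a[k | j] ^= t;                    /* update low field of a[k|j]  */
        }
    }
}
```

Like the original, this works in place and runs the inner body 192 times.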

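Edit:

This is actually not quite what you asked for, since you asked for a non-destructive version of this code. You can get that by having the first swap of the 32x32 blocks go from x to y. For instance, you might do something like: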

void
non_destructive_transpose64(uint64_t x[64], uint64_t y[64]) {
  int j, k;
  uint64_t m, t;

  for (k = 0; k < 64; k += 2) {
    ((uint32_t *) y)[k] = ((uint32_t *) x)[(k ^ 64) + 1];
    ((uint32_t *) y)[k + 1] = ((uint32_t *) x)[k + 1];
  }
  for (; k < 128; k += 2) {
    ((uint32_t *) y)[k] = ((uint32_t *) x)[k];
    ((uint32_t *) y)[k + 1] = ((uint32_t *) x)[k ^ 64];
  }
  for (j = 16, m = 0x0000FFFF0000FFFF; j; j >>= 1, m ^= m << j) {
    for (k = 0; k < 64; k = ((k | j) + 1) & ~j) {
      t = (y[k] ^ (y[k | j] >> j)) & m;
      y[k] ^= t;
      y[k | j] ^= (t << j);
    }
  }
}

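Unlike the other version of the code, this one does not work regardless of the endianness of your architecture: the uint32_t indexing as written assumes a little-endian layout. Also, I know that the C standard does not allow you to access an array of uint64_t as an array of uint32_t. However, I like that the first iteration of the block-moving loop needs no shifts or XORs when you do it this way.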
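
If the pointer casts are a concern, the first 32x32 exchange can also be written with shifts and masks, which avoids accessing uint64_t storage through uint32_t pointers and does not depend on byte order. The following is a minimal sketch under that assumption; the name transpose64_copy is hypothetical, and it gives up the "no shifts in the first pass" property mentioned above.

```c
#include <stdint.h>

/* Hypothetical portable non-destructive variant: reads x, writes y.
 * Uses the same bit convention as transpose64. x and y must not overlap. */
void transpose64_copy(const uint64_t x[64], uint64_t y[64]) {
    int j, k;
    uint64_t m, t;

    /* First pass (block size 32), fused with the copy from x to y:
     * row k keeps its high half and receives the high half of row k + 32;
     * row k + 32 keeps its low half and receives the low half of row k. */
    for (k = 0; k < 32; k++) {
        y[k]      = (x[k] & 0xFFFFFFFF00000000ULL) | (x[k + 32] >> 32);
        y[k + 32] = (x[k] << 32) | (x[k + 32] & 0x00000000FFFFFFFFULL);
    }

    /* Remaining passes (block sizes 16 down to 1) work in place on y. */
    for (j = 16, m = 0x0000FFFF0000FFFFULL; j; j >>= 1, m ^= m << j) {
        for (k = 0; k < 64; k = ((k | j) + 1) & ~j) {
            t = (y[k] ^ (y[k | j] >> j)) & m;
            y[k] ^= t;
            y[k | j] ^= t << j;
        }
    }
}
```

The input array is only read, so it is left unchanged as the question requires.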

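【Comments】:

  • It would be better to show an implementation, although this is really helpful.
  • Don't answer with a link to another answer. Vote to close the question as a duplicate of the other question.
  • @ziggystar: Did you downvote? Anyway, the this source code link contains the implementation you asked for. Since I did not write that code, I did not want to put it here as well.
  • @Andreas: Did you give me a downvote? Anyway, I am new here and do not know how this is supposed to work; please point me to documentation on how it works. This is not really a duplicate, since the other question is about 8x8 and this one is about 64x64.
  • "Did you give me a downvote?" It is not you who gets downvoted, it is the answer. I think that is a big difference.
【Solution 2】:

In C++, for an 8x8 matrix it looks like this, but you can easily change it to be more general (not just 8x8). I have also included one test vector in main, just to get a feel for it: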

#include <iostream>
#include <string>
#include <vector>

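// Transposes an 8x8 bit matrix stored one row per element (low 8 bits only).
// Despite the name, this is a transpose about the main diagonal, not a rotation.
std::vector<long> rotate(const std::vector<long>& v) {
    std::vector<long> temp(8, 0);
    for (unsigned int i = 0; i < 8; i++) {
        long number = v[i];
        for (unsigned int j = 0; j < 8; j++) {
            // Bit (7 - j) of row i becomes bit (7 - i) of row j.
            if (number & (1L << (7 - j))) {
                temp[j] |= (1L << (7 - i));
            }
        }
    }
    return temp;
}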



int main()
{
    std::vector<long> v = { 0, 1, 2, 3, 4, 5, 6, 7 };
    std::vector<long> rotated = rotate(v);
    for (unsigned int i = 0; i<8; i++) {
        std::cout << rotated.at(i) << " ";
    }
    return 0;
}

So if you need it in Java, you can easily translate it, since Java provides bitwise operators as well.
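
The generalization mentioned above (not just 8x8) could be sketched in C roughly as follows. The function name transpose_bits_n and its parameters are hypothetical; it keeps the bit-by-bit approach and the "highest used bit is column 0" convention of the 8x8 code, so it does n * n single-bit steps.

```c
#include <stdint.h>

/* Hypothetical generalization of rotate(): transposes an n x n bit matrix
 * for 1 <= n <= 64. Row r is stored in the low n bits of in[r], with bit
 * (n - 1) as column 0, matching the 8x8 code. out must not overlap in. */
void transpose_bits_n(const uint64_t *in, uint64_t *out, unsigned n) {
    for (unsigned j = 0; j < n; j++)
        out[j] = 0;
    for (unsigned i = 0; i < n; i++) {
        for (unsigned j = 0; j < n; j++) {
            /* Bit (n - 1 - j) of row i becomes bit (n - 1 - i) of row j. */
            if (in[i] & (1ULL << (n - 1 - j)))
                out[j] |= 1ULL << (n - 1 - i);
        }
    }
}
```

For the test vector 0, 1, 2, 3, 4, 5, 6, 7 with n = 8 this yields 0 0 0 0 0 15 51 85.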

【Comments】:
