Need help decompressing GIF raster data

[Question title]: Need help decompressing GIF raster data
[Posted]: 2014-10-10 07:34:38
[Question description]:

I have a 10x10 GIF made up of 4 colors: white, red, blue, and black. I have parsed the GIF data as shown below.

4749 4638 3961                      <-- header

0a00 0a00 9100 00                   <-- lsd (pb 91 = 1001 0001) nColors = 4, bytes = 12

ffffff ff0000 0000ff 000000         <-- global color table
                                    #0  FF FF FF
                                    #1  FF 00 00
                                    #2  00 00 FF
                                    #3  00 00 00

21f9 04(00) (0000) (00)00           <-- graphics control extension 
                                    (00) = pb (000 reserved 000 disposal method 0 user input flag 0 transparent color flag) 
                                    (0000) = delay time                     
                                    (00) = transparent color index

2c 0000 0000 0a00 0a00 00           <-- image descriptor

02 16                               <-- image data: 02 = LZW min code size, 0x16 = size of the data sub-block (22 bytes)
8c2d 9987 2a1c dc33 a002 75ec 95fa a8de 608c 0491 4c01 00   <-- image block, followed by the 00 block terminator
3b                                  <-- trailer

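OK, so above we have our image data (the part labeled image block), and I am trying to decompress it so that I can recover the original image. My understanding is that the bytes are read left to right and the bits within each byte right to left, starting at lzwmincodesize + 1 bits per code (2 + 1 = 3 bits at a time).

Here is the decompression algorithm I am following: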
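The bit order described here can be sketched as follows. This is only an illustration: the helper name `unpack_fixed_width` is hypothetical, and it assumes every code is exactly 3 bits wide, which is the reading used in the step-by-step table further down.

```python
def unpack_fixed_width(data, width, count):
    """Read `count` codes of `width` bits each, least significant bit first."""
    codes = []
    bit_pos = 0
    for _ in range(count):
        code = 0
        for i in range(width):
            byte = data[(bit_pos + i) // 8]
            bit = (byte >> ((bit_pos + i) % 8)) & 1
            code |= bit << i  # the first bit read is the lowest bit of the code
        codes.append(code)
        bit_pos += width
    return codes


# First three bytes of the image block: 8c 2d 99
codes = unpack_fixed_width(bytes.fromhex("8c2d99"), 3, 8)
print([format(c, "03b") for c in codes])
# ['100', '001', '110', '110', '010', '010', '110', '100']
```

Bytes are consumed in order, and within each byte the lowest bit comes first, so a code can straddle a byte boundary (the third code above takes two bits from `8c` and one from `2d`).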

Initialize code table
let CODE be the first code in the code stream
output {CODE} to index stream
<LOOP POINT>
let CODE be the next code in the code stream
is CODE in the code table?
Yes:
    output {CODE} to index stream
    let K be the first index in {CODE}
    add {CODE-1}+K to the code table
No:
    let K be the first index of {CODE-1}
    output {CODE-1}+K to index stream
    add {CODE-1}+K to code table
return to LOOP POINT
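The pseudocode above can be sketched in Python roughly as follows. The function name `decode_codes` is hypothetical, and the sketch assumes the codes have already been unpacked from the bit stream into integers; how wide each code is in the bit stream is a separate matter that it does not address.

```python
def decode_codes(codes, min_code_size):
    """Turn a list of LZW codes into a list of color indices."""
    clear_code = 1 << min_code_size
    end_code = clear_code + 1

    def fresh_table():
        # One single-index entry per color, then placeholder slots for
        # the clear code and the end-of-information code.
        return [[i] for i in range(clear_code)] + [None, None]

    table = fresh_table()
    out = []
    prev = None  # the previous code, written {CODE-1} in the pseudocode
    for code in codes:
        if code == clear_code:
            table = fresh_table()
            prev = None
            continue
        if code == end_code:
            break
        if prev is None:
            entry = table[code]  # first code after a clear: output only
        elif code < len(table):
            entry = table[code]  # "Yes" branch: code is in the table
            table.append(table[prev] + [entry[0]])
        elif code == len(table):
            entry = table[prev] + [table[prev][0]]  # "No" branch
            table.append(entry)
        else:
            raise ValueError("code %d is beyond the next free slot" % code)
        out.extend(entry)
        prev = code
    return out


print(decode_codes([4, 1, 6, 6, 2, 9, 9], 2))
# [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
```

A real GIF decoder would also stop adding entries once the table holds 4096 codes; that limit is left out here for brevity.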

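I am stepping through the decompression algorithm, and this is what I have come up with so far (starting with the codes from the first 3 bytes):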

Global Color Table
000 FF FF FF
001 FF 00 00
010 00 00 FF
011 00 00 00
100 CLEAR
101 End of Data

 3  2   1   6  5   4      8   7    
10|001|100  0|010|110|1  100|110|01

last        current     output      cindex      exists      dictionary      value
            100                     4                                       CLEAR
100         001         001         1           y                           RED
001         110         001 001     6           n           +001 001        RED RED
001 001     110         001 001     6           y           +001 001 001    RED RED
001 001     010         010         2           y           +001 001 010    BLUE
010         010         010         2           y           +010 010        BLUE
010         110         001 001     6           y           +010 001 001    RED RED
001 001     100                     4                                       CLEAR
100         111         111         7 ???? <--- (what do I do here)?

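I should get 5 red values followed by 5 blue values for the first 10 pixels, but as you can see, it decodes 5 reds, then 2 blues, then 2 reds. Can anyone point out what I am doing wrong here?

Thanks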

[Question discussion]:

Tags: gif compression lzw

[Solution 1]:

Your error comes from missing the code size increase. Here are the codes, the 'nextcode' values, and the current code size:

Code read from bitstream:    100, 001, 110, 110, 0010, 1001
internal 'Next' code value:  110, 110, 110, 111, 1000, 1001
current code size:             3,   3,   3,   3,    4,    4

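The logic missing from your decode loop is that you need to maintain a 'nextcode' variable, which tells you where to insert new codes into the table and when to increase the code size. It starts at 'clearcode + 2' and is incremented after each code read from the bitstream, except for clear codes and the first code that follows a clear code. The logic is basically this: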

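clear_dictionary:
   clearcode = 1 << codestart;   // codestart = LZW min code size from the image data
   codesize = codestart + 1;     // codes start one bit wider than the min code size
   nextcode = clearcode + 2;     // first free slot, after the clear and end-of-data codes
   nextlimit = 1 << codesize;    // codesize grows once nextcode reaches this value
   oldcode = -1;                 // no previous code yet
mainloop:
   while (!done)
   {
      code = getCode();          // reads codesize bits from the bitstream
      if (code == clearcode)
         goto clear_dictionary;
      if (oldcode == -1)
      {
         write_code_to_output(code);   // first code after a clear: output only
      }
      else
      {
         <LZW logic>
         nextcode++;
         if (nextcode >= nextlimit && codesize < 12)   // GIF codes never exceed 12 bits
         {
            nextlimit <<= 1;
            codesize++;
         }
      }
      oldcode = code;
   } // while !done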
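For readers who want something runnable, here is a minimal Python sketch of the same loop, applied to the 22 data bytes from the question. The function name `decode_gif_lzw` is hypothetical, `len(table)` stands in for the 'nextcode' variable, and the sketch assumes the sub-block size bytes and the terminator have already been stripped.

```python
def decode_gif_lzw(data, min_code_size):
    """Decode raw GIF image-data bytes into a list of color indices."""
    clear_code = 1 << min_code_size
    end_code = clear_code + 1
    roots = [[i] for i in range(clear_code)] + [None, None]

    table = list(roots)
    code_size = min_code_size + 1
    next_limit = 1 << code_size
    old_code = -1
    out = []
    bit_pos = 0

    while bit_pos + code_size <= len(data) * 8:
        # Codes are packed least significant bit first.
        code = 0
        for i in range(code_size):
            bit = (data[(bit_pos + i) // 8] >> ((bit_pos + i) % 8)) & 1
            code |= bit << i
        bit_pos += code_size

        if code == clear_code:
            table = list(roots)
            code_size = min_code_size + 1
            next_limit = 1 << code_size
            old_code = -1
            continue
        if code == end_code:
            break

        if old_code == -1:
            entry = table[code]  # first code after a clear: output only
        else:
            if code < len(table):
                entry = table[code]
            elif code == len(table):
                entry = table[old_code] + [table[old_code][0]]
            else:
                raise ValueError("invalid code %d" % code)
            if len(table) < 4096:
                table.append(table[old_code] + [entry[0]])
                # len(table) is now the next free code ("nextcode").
                if len(table) >= next_limit and code_size < 12:
                    next_limit <<= 1
                    code_size += 1
        out.extend(entry)
        old_code = code
    return out


data = bytes.fromhex("8c2d99872a1cdc33a00275ec95faa8de608c04914c01")
indices = decode_gif_lzw(data, 2)
print(indices[:10])  # [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
print(len(indices))  # 100
```

The first ten indices are five reds followed by five blues, which is what the question expected for the first row.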

[Discussion]:

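How are you maintaining the next code? My next-code stream comes out like this: 110 110 010 010 110 100 111
For every code that is not a clear code and is not the first code after a clear code, you increment the 'nextcode' variable.
OK, here is the code stream: 10001100 00101101 10011001 10000111. For every code that is not a cc and is not the first code after a cc, this is how the next-code variable lines up for me. Codes read: 100 001 110 110 010 100 (cc read, 111 is the next code). Next code: 110 110 010 010 110 111 (the one you said to skip). If I try to follow this with my algorithm, the output is still incorrect.
'Next code' always just increments; it is an internal variable, not a code read from the bitstream.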
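To illustrate that last point, the following sketch tracks 'nextcode' purely as a counter while reading codes, and reproduces the three rows of the table in the answer. The function name `trace_code_sizes` is hypothetical, and the sketch deliberately omits the LZW table itself, end-of-data handling, and the 4096-entry limit.

```python
def trace_code_sizes(data, min_code_size, count):
    """Return (code, next_code, code_size) as seen when each code is read."""
    clear_code = 1 << min_code_size
    code_size = min_code_size + 1
    next_code = clear_code + 2
    next_limit = 1 << code_size
    first_after_clear = True
    bit_pos = 0
    rows = []
    for _ in range(count):
        # Read code_size bits, least significant bit first.
        code = 0
        for i in range(code_size):
            bit = (data[(bit_pos + i) // 8] >> ((bit_pos + i) % 8)) & 1
            code |= bit << i
        bit_pos += code_size
        rows.append((code, next_code, code_size))

        if code == clear_code:
            code_size = min_code_size + 1
            next_code = clear_code + 2
            next_limit = 1 << code_size
            first_after_clear = True
        elif first_after_clear:
            first_after_clear = False  # no table entry is added for this code
        else:
            next_code += 1  # a counter, never a value taken from the stream
            if next_code >= next_limit and code_size < 12:
                next_limit <<= 1
                code_size += 1
    return rows


for code, next_code, size in trace_code_sizes(bytes.fromhex("8c2d99"), 2, 6):
    print(format(code, "0%db" % size), format(next_code, "b"), size)
# 100 110 3
# 001 110 3
# 110 110 3
# 110 111 3
# 0010 1000 4
# 1001 1001 4
```

The fifth code is read as four bits because 'nextcode' reached 8 after the fourth code, not because of anything in the bitstream itself.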