【Question title】: GZIP Decompressing String Magic Number Exception
【Posted】: 2016-03-16 10:06:05
【Question description】:

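I am trying to decompress a string, but I always get a magic number exception when decompressing. First I compress a string, then it is Base64-encoded, then decoded and decompressed. The code is listed in call order.

This is an Android project with no external dependencies.

Encoding and decoding work fine.

Can anyone spot the mistake in this code and tell me how to fix it?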

//Compressing a String
public String compress(String s) {
  if(s == null || s.length() == 0) { return s; }

  try {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(out);
    gzip.write(s.getBytes("UTF-8"));
    gzip.close();
    return out.toString("UTF-8");
  } catch (Exception e) {
    e.printStackTrace();
    return null;
  }
}
//Encode to Base64
public String encodeBase64(String s) throws UnsupportedEncodingException {
  return Base64.encodeToString(s.getBytes("UTF-8"), Base64.NO_WRAP);
}

//Decode Base64
public String decodeBase64(String s) throws UnsupportedEncodingException {
  return new String(Base64.decode(s, Base64.NO_WRAP), "UTF-8");
}

public String decompress(String s) throws UnsupportedEncodingException {
  if(s == null || s.length() == 0) {
    return s;
  }
  byte[] ba = s.getBytes("UTF-8");
  byte[] buffer = new byte[1024];

  try {
    ByteArrayOutputStream out = new ByteArrayOutputStream(ba.length);
    ByteArrayInputStream in = new ByteArrayInputStream(ba);
    GZIPInputStream gzip = new GZIPInputStream(in); // Magic Number Exception occurs here
    int len;
    while((len = gzip.read(buffer)) > 0) {
      out.write(buffer,0 ,len);
    }
    gzip.close();
    out.close();
    return out.toString("UTF-8");
  } catch (Exception e) {
    e.printStackTrace();
    return null;
  } 
}

Update:

I created a test class containing the methods, plus the code that calls them. It should work in an Android project:

import android.util.Base64;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressTest {

    public static final String message = "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.   \n" +
            "\n" +
            "Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi. Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat.   \n" +
            "\n" +
            "Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.   \n" +
            "\n" +
            "Nam liber tempor cum soluta nobis eleifend option congue nihil imperdiet doming id quod mazim placerat facer possim assum. Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.   \n" +
            "\n" +
            "Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis.   \n" +
            "\n" +
            "At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, At accusam aliquyam diam diam dolore dolores duo eirmod eos erat, et nonumy sed tempor et et invidunt justo labore Stet clita ea et gubergren, kasd magna no rebum. sanctus sea sed takimata ut vero voluptua. est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat.   \n" +
            "\n" +
            "Consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus.   \n" +
            "\n" +
            "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.   \n" +
            "\n" +
            "Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi. Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat.   \n" +
            "\n" +
            "Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.   \n" +
            "\n" +
            "Nam liber tempor cum soluta nobis eleifend option congue nihil imperdiet doming id quod mazim placerat facer possim assum. Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo";

    //Compressing a String
    public static String compress(String s) {
        if(s == null || s.length() == 0) { return s; }

        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            GZIPOutputStream gzip = new GZIPOutputStream(out);
            gzip.write(s.getBytes("UTF-8"));
            gzip.close();
            return out.toString("UTF-8");
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }
    //Encode to Base64
    public static String encodeBase64(String s) throws UnsupportedEncodingException {
        return Base64.encodeToString(s.getBytes("UTF-8"), Base64.NO_WRAP);
    }

    //Decode Base64
    public static String decodeBase64(String s) throws UnsupportedEncodingException {
        return new String(Base64.decode(s, Base64.NO_WRAP), "UTF-8");
    }

    public static String decompress(String s) throws UnsupportedEncodingException {
        if(s == null || s.length() == 0) {
            return s;
        }
        byte[] ba = s.getBytes("UTF-8");
        byte[] buffer = new byte[1024];

        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream(ba.length);
            ByteArrayInputStream in = new ByteArrayInputStream(ba);
            GZIPInputStream gzip = new GZIPInputStream(in); // Magic Number Exception occurs here
            int len;
            while((len = gzip.read(buffer)) > 0) {
                out.write(buffer,0 ,len);
            }
            gzip.close();
            out.close();
            return out.toString("UTF-8");
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }
}

Here is the calling code:

try {
            String s = CompressTest.message;
            System.out.println("-----------------Message is:-----------------");
            s = CompressTest.compress(s);
            System.out.println("-----------------Compression is:-----------------");
            System.out.println(s);

            s = CompressTest.encodeBase64(s);
            System.out.println("-----------------Base64 Encode is:-----------------");
            System.out.println(s);

            s = CompressTest.decodeBase64(s);
            System.out.println("-----------------Base64 Decode is:-----------------");
            System.out.println(s);

            s = CompressTest.decompress(s);
            System.out.println("-----------------decompressed is:-----------------");
            System.out.println(s);
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }

I also ran a test with the data from the test class above.

Message: see the code above.

Compressed:

����������������Mn�0��9����vtS���cq�L���8MN�7�\�A[��"��- Y$.o.h.TI.c.D.R.i'N.74.R.K7.c..n.I..(.l.Dk*..h>j. ��:E    %>d&��h�X@��%�؍�t��(����y���N_��B�ʲ��\��,m� 3D����L�l'�P%o����:?hbG[t9Se1��6������-��)# {P�2N��m�u5�%A�$ϐ�EN�@�y�r��o�d�[���"���mlVrM��G�9����p<.>�/�O8�/� C ��i=v_Q���4���2��������Q�����

Encoded:

H++/vQgAAAAAAAAA77+9TW7vv70wEO+/ve+/vTnvv70c77+977+9He+/vXYZdFPvv70AY3Hvv71M77+9H++/ve+/vThNTu+/vTfvv71c77+9QVvvv73vv70iDu+/ve+/vS1ZJO+/vW/vv71o77+9VEnvv71j77+9RO+/ve+/vVLvv71pJ07vv703NO+/vdykSzfvv73vv70BYwbvv70H77+977+977 +9bu+/vUnvv73vv73vv70o77+9bO+/vURrKu+/ve+/ve+/vRHvv71oPmrvv73vv73vv706Rd6hCEnvv70LCCU+ZCbvv73vv71o77+9WEAq77+977+9Je+/vdiN77+9dO+/ve+/vSjvv73vv73vv73vv715GO+/ve+/ve+/vU5f77+977+9Qu+/vcqy77+977+9XO+/ve+/ve+ /vSxt77+9M0Tvv70A77+9TA/vv70CHWwn77+9UCVv77+9Fu+/ve+/vTo/aGI8b++/ve+/vW7vv73vv70677+977+977+977+9D++/vVZOK++/vX/vv71EdHPvv73RtBEbLEJXJO+/vTDvv71qcDTvv73vv71M77+9Q++/ve+/vWnvv705Whzvv71zFx/vv73vv73vv70177+9LxHvv71UZu+/ve+ /ve+/ve+/vX0177+977+9VBbvv73vv73vv71iB++/vU7vv71iZO+/ve+/vVHvv73vv70/Qe+/vWcqCS7vv71aJywlKO+/ve+/vSFra++/ve+/vdGUdu+/vXPvv73vv73vv73vv700QVscMXcUfHnvv70aKUjvv73vv71e77+9YHDvv70+FyUQ77+9SlhfK++/ve+/vQzvv70d77+977+977+977+9c3heZwfvv73Juu+/ve+ /vWltU++/vTQPF++/vUHvv71fRmdJzpTvv70+77+90qMv77+977+977+977+9YCfvv70wJe+/vSNuOW3vv71FQ++/vdarBe+/vW9SB0UU77+97 7+9TG4uDe+/ve+/vRDvv70Makfvv73vv70F77+977+9PkdbdDlTZR0x77+977+9Nu+/vQTvv73vv73vv73vv70t77+9Ge+/vSkj3pJP77+9VFTvv73vv73vv73vv70BYu+/vXnvv70Y77+977+9YR3vv73vv717UO+/vTJO77+977+9be+/ve+/ve+/vR51Ne+/vSVB77+9DCTPkAbvv71FThLvv71A77+9GHnvv71y77+977+ 9bxrvv71k77+9W++/ve+/ve+/vSLvv73vv73vv71tbFZyTRvvv73vv71H77+9Od2e77+977+9cDzvv70/ZiInF++/vWgBOe+/ve+/ve+/ve+/ve+/ve+/ve+/vRkt77+977+9bz9j77+977+9LO+/vS/vv73vv70q77+9L++/vU8877+ 9BgDvv73vv73vv70E77+977+9TDVOdGzvv705J++/vSRP77+977+9YQ3vv701Zu+/ve+/vT4477+9L++/vQpjDe+/ve+/vWk9dl9R77+977+977+9NO+/ve+/ve+/vTLvv70e77+977+977+977+977+9He+/ve+/vVEY77+ 9FwAA

Decoded:

����������������Mn�0��9����vtS���cq�L���8MN�7�\�A[��” ��-Y$�o�h�TI�c�D��R�i'N�74�R�K7��c����n�I���(�l�Dk*����h >j����:E ��%>d&��h�X@��%������t��(����y��N_��B�ʲ��\��� ,m�3D����L�l'�P%o����:?hbG[t9Se1��6������-��)# a��{P�2N��m�u5�%A�$ϐ�EN�@�y�r��o�d�[���"���mlVrM��G�9���� p�/�O8�/� C ��i=v_Q���4���2��������Q�����

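It is not really visible here, but I compared the compressed string before encoding with the one after decoding, and it looks like this:

One character at the beginning is missing; everything else is fine. So this is probably part of the data header that carries the gzip format information. I will try adding it back manually later and post the result.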
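
One way to test that guess is to look at the first bytes directly instead of the printed text. The sketch below decodes only the first eight characters of the Base64 output posted above and compares them with the gzip magic number. It assumes a plain JVM with `java.util.Base64` standing in for `android.util.Base64`; the class name and the `hasGzipMagic` and `hex` helpers are hypothetical.

```java
import java.util.Base64;
import java.util.zip.GZIPInputStream;

public class GzipHeaderCheck {

    // Hypothetical helper: true if data starts with the gzip magic bytes 0x1f 0x8b.
    public static boolean hasGzipMagic(byte[] data) {
        if (data == null || data.length < 2) {
            return false;
        }
        // GZIP_MAGIC is 0x8b1f: the stream stores the low byte first.
        int header = (data[0] & 0xff) | ((data[1] & 0xff) << 8);
        return header == GZIPInputStream.GZIP_MAGIC;
    }

    // Hypothetical helper: space-separated lowercase hex dump.
    public static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First eight characters of the Base64 string from the question.
        byte[] posted = Base64.getDecoder().decode("H++/vQgA");
        System.out.println(hex(posted));          // 1f ef bf bd 08 00
        System.out.println(hasGzipMagic(posted)); // false

        // What an intact gzip header starts with.
        System.out.println(hasGzipMagic(new byte[] {0x1f, (byte) 0x8b, 0x08})); // true
    }
}
```

In the posted data the second header byte, 0x8b, shows up as `ef bf bd`, which is the UTF-8 encoding of U+FFFD, the replacement character. So the byte appears to have been substituted rather than dropped.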


Solution

For everyone interested in how I changed the code, here is the byte-array version of the approach suggested by Teemu Ilmonen:

//Compressing a String
    public static byte[] compress(String s) {
        if(s == null || s.length() == 0) { return null; }

        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            GZIPOutputStream gzip = new GZIPOutputStream(out);
            gzip.write(s.getBytes("UTF-8"));
            gzip.close();
            return out.toByteArray();
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }
    //Encode to Base64
    public static String encodeBase64(byte[]  array) throws UnsupportedEncodingException {
        return Base64.encodeToString(array, Base64.NO_WRAP);
    }

    //Decode Base64
    public static byte[] decodeBase64(String s) throws UnsupportedEncodingException {
        return Base64.decode(s, Base64.NO_WRAP);
    }

    public static String decompress(byte[] array) throws UnsupportedEncodingException {
        if(array == null || array.length == 0) {
            return null;
        }
        byte[] buffer = new byte[1024];

        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream(array.length);
            ByteArrayInputStream in = new ByteArrayInputStream(array);
            GZIPInputStream gzip = new GZIPInputStream(in); // now reads the raw gzip bytes, header intact
            int len;
            while((len = gzip.read(buffer)) > 0) {
                out.write(buffer,0 ,len);
            }
            gzip.close();
            out.close();
            return out.toString("UTF-8");
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }
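
The calling code for the byte-array version is not shown above, so here is a minimal sketch of how the round trip might look. It assumes a plain JVM, with `java.util.Base64` in place of `android.util.Base64`; the class name and the `pack`/`unpack` wrappers are hypothetical and not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressRoundTrip {

    // String -> gzip bytes. The result stays a byte[] and is never turned into a String.
    public static byte[] compress(String s) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(s.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // gzip bytes -> String.
    public static String decompress(byte[] array) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(array))) {
            int len;
            while ((len = gzip.read(buffer)) > 0) {
                out.write(buffer, 0, len);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Hypothetical wrapper: text -> gzip bytes -> Base64 text.
    public static String pack(String message) throws IOException {
        return Base64.getEncoder().encodeToString(compress(message));
    }

    // Hypothetical wrapper: Base64 text -> gzip bytes -> text.
    public static String unpack(String transport) throws IOException {
        return decompress(Base64.getDecoder().decode(transport));
    }

    public static void main(String[] args) throws IOException {
        String message = "Lorem ipsum dolor sit amet, consetetur sadipscing elitr";
        String transport = pack(message);
        System.out.println(transport);                         // plain ASCII text
        System.out.println(unpack(transport).equals(message)); // true
    }
}
```

A quick sanity check on the transport string: Base64 of an intact gzip stream starts with `H4sI`, because the stream starts with the bytes 1f 8b 08. The Base64 output posted in the question starts with `H++/` instead.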

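【Comments】:

  • Can you provide the contents of ba for which it fails?
  • Added the test class and the results of each operation. I noticed some differences when comparing the compressed and the decoded strings.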

Tags: java android gzip compression


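【Answer 1】:

Once you have compressed something with gzip, you should not convert it back to a String. You are probably garbling the content there. You should use byte[] instead: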

return out.toByteArray();
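
To make the garbling concrete, here is a small sketch that pushes the same gzip bytes through a UTF-8 String and through Base64 and compares the results. It assumes a plain JVM (`java.util.Base64` rather than `android.util.Base64`), and the class and helper names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

public class LossyStringDemo {

    public static byte[] gzip(String s) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(s.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // Same effect as out.toString("UTF-8") followed by getBytes("UTF-8"):
    // byte sequences that are not valid UTF-8 are replaced while decoding.
    public static byte[] viaUtf8String(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    // Base64 maps any byte sequence to text and back without loss.
    public static byte[] viaBase64(byte[] data) {
        String text = Base64.getEncoder().encodeToString(data);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = gzip("Lorem ipsum dolor sit amet");
        System.out.println(Arrays.equals(raw, viaUtf8String(raw))); // false
        System.out.println(Arrays.equals(raw, viaBase64(raw)));     // true
    }
}
```

The String detour always fails for gzip data, because the second header byte, 0x8b, is not valid on its own in UTF-8 and is replaced during decoding.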

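【Comments】:

  • If I use only byte arrays and do the Base64 encoding and decoding, I really do get the same byte array back. So this works, but your answer does not tell me why, so I ran some tests; the results are in the question. The first character of the string is missing. That should be the GZIP magic number, 0x8b1f, but I don't know whether this is just a coincidence with the test string or whether it always happens. If you know more about it, you could update your answer for others.

【Answer 2】: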

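You need to do it in this order:

  1. Convert the string to Base64.
  2. Compress the Base64 string with GZip.
  3. Decompress the GZip stream.
  4. Decode the Base64 back to plain text.

You get the exception because you are trying to decompress a Base64 string rather than a GZip stream.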
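
For comparison, the four steps above can be sketched as follows. This is only an illustration of that order under the assumption of a plain JVM with `java.util.Base64`; the class and method names are hypothetical. Note that the compressed result is still binary, so it has to stay a byte[] here as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class Base64ThenGzip {

    public static byte[] encodeThenCompress(String text) throws IOException {
        // Step 1: string -> Base64 text.
        String b64 = Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
        // Step 2: gzip the Base64 text. The output is binary again.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(b64.getBytes(StandardCharsets.US_ASCII));
        }
        return out.toByteArray();
    }

    public static String decompressThenDecode(byte[] compressed) throws IOException {
        // Step 3: gunzip back to the Base64 text.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            int len;
            while ((len = gz.read(buffer)) > 0) {
                out.write(buffer, 0, len);
            }
        }
        String b64 = new String(out.toByteArray(), StandardCharsets.US_ASCII);
        // Step 4: Base64 text -> original string.
        return new String(Base64.getDecoder().decode(b64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String message = "Lorem ipsum dolor sit amet";
        byte[] compressed = encodeThenCompress(message);
        System.out.println(decompressThenDecode(compressed).equals(message)); // true
    }
}
```

With this order the final payload is not printable text, and the compressor is fed Base64 text that is about a third larger than the original bytes.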

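【Comments】:

  • I chose to accept Teemu's answer because it addresses the problem in my code more precisely. By switching from String to byte array I can keep my order of execution. Base64 correctly encodes a byte array to a string, and the process can be reversed without failure. This may be of more interest if you want to send the data through a web service.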