Parsing int from the start of a byte[] from a socket

【Question title】: Parsing int from the start of a byte[] from a socket
【Posted】: 2016-05-02 08:15:44
【Question】:

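I have a Java application that reads data from a TCP socket, which delivers XML messages of varying size. The first 5 bytes of a given packet are supposed to indicate the size of the rest of the message. If I manually create a large byte[] and read the data into it, I can read the message and the XML successfully.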

Here are the instructions from the manual of the application that generates the data:

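Each message is preceded by the message size indicator, which is a 32-bit unsigned integer using the network byte order method. For example: \x05\x00\x00\x00\x30\x31\x30\x32\x00 indicates the message size of an ack of 5 bytes, including the fifth message byte "\0". The size indicator specifies the size of everything after the size indicator itself.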

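However, I can't work out how to decode the first 5 bytes into an integer that I can use to size a byte[] correctly for reading the rest of the message. I get random results.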

Here is the code I am using to parse the message:

DataOutputStream out = new DataOutputStream(clientSocket.getOutputStream());
BufferedInputStream inFromServer = new BufferedInputStream(clientSocket.getInputStream());

byte[] data = new byte[10];
inFromServer.read(data);
String result = new String(data, "ISO-8859-1");

Logger.info(data+"");

//PROBLEM AREA: Tried reading different byte lengths but no joy
//This should be a number but it never is. Often strange symbols
byte[] numeric = Arrays.copyOfRange(data,1,5);
String numericString = new String(numeric, "ISO-8859-1");

//Create a huge array to make sure everything gets captured. 
//Want to use the parsed value from the start here
byte[] message = new byte[1000000];
inFromServer.read(message);

//This works as expected and returns correctly formatted XML
String fullMessage = new String(message, "ISO-8859-1");

Logger.info("Result "+result+ " Full message "+fullMessage);

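【Comments】:

The message length is in the first four bytes, not five.
That "network byte order" looks a lot like little-endian, also known as not network byte order.
The instructions are incorrect. That is not 5 in network byte order. If it were, you could use DataInputStream.readInt(). In fact, you should complain to the vendor ("ask for clarification"). And this isn't XML.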
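
To make the comments above concrete, here is a minimal sketch that decodes the sample bytes quoted from the manual in both byte orders. The class and method names (LengthPrefixDemo, decodeLength) are made up for illustration and do not come from the question.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LengthPrefixDemo {
    // Sample bytes quoted from the manual: a 4-byte size prefix, then "0102" and a NUL.
    static final byte[] SAMPLE = {0x05, 0x00, 0x00, 0x00, 0x30, 0x31, 0x30, 0x32, 0x00};

    /** Interprets the first four bytes of data as an unsigned 32-bit length. */
    static long decodeLength(byte[] data, ByteOrder order) {
        int raw = ByteBuffer.wrap(data, 0, 4).order(order).getInt();
        return Integer.toUnsignedLong(raw);
    }

    public static void main(String[] args) {
        System.out.println("big-endian:    " + decodeLength(SAMPLE, ByteOrder.BIG_ENDIAN));
        System.out.println("little-endian: " + decodeLength(SAMPLE, ByteOrder.LITTLE_ENDIAN));
    }
}
```

The sample prefix decodes to 83886080 when read as big-endian (network byte order) and to 5 when read as little-endian, and 5 matches the five bytes that follow the prefix. It also suggests why the question's code shows strange symbols: Arrays.copyOfRange(data, 1, 5) skips the first byte, and turning raw binary length bytes into a String yields control characters rather than digits.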

Tags: java sockets tcp bytearray

【Solution 1】:

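The length looks like it is little-endian. You can still use DataInputStream, but you have to swap the bytes. If you use NIO's SocketChannel and ByteBuffer you can set the byte order, but that is likely to be harder to use.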

// only do this once per socket.
DataInputStream in = new DataInputStream(
                                  new BufferedInputStream(clientSocket.getInputStream()));

// for each message.
int len0 = in.readInt();
int len = Integer.reverseBytes(len0);
assert len < 1 << 24;

byte[] bytes = new byte[len];
in.readFully(bytes);

String text = new String(bytes, "ISO-8859-1").trim();
int number = Integer.parseInt(text);
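
For comparison, the NIO alternative mentioned in this answer might look roughly like the sketch below. It is written against ReadableByteChannel (which SocketChannel implements) and assumes the channel is in blocking mode. The names NioFrameReader, readFully, readMessage and MAX_LEN are hypothetical, and the size limit simply mirrors the assert above.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

public class NioFrameReader {
    // Hypothetical sanity limit, mirroring the assert in the answer.
    static final int MAX_LEN = 1 << 24;

    /** Keeps reading until buf is full; a single channel read may deliver fewer bytes. */
    static void readFully(ReadableByteChannel ch, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            if (ch.read(buf) < 0) {
                throw new EOFException("stream ended mid-message");
            }
        }
    }

    /** Reads one length-prefixed message, decoding the prefix in the given byte order. */
    static String readMessage(ReadableByteChannel ch, ByteOrder order) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(4).order(order);
        readFully(ch, header);
        header.flip();                       // switch the buffer from writing to reading
        int len = header.getInt();
        if (len < 0 || len >= MAX_LEN) {
            throw new IOException("implausible message length: " + len);
        }
        ByteBuffer body = ByteBuffer.allocate(len);
        readFully(ch, body);
        return new String(body.array(), StandardCharsets.ISO_8859_1);
    }
}
```

Passing ByteOrder.LITTLE_ENDIAN or ByteOrder.BIG_ENDIAN replaces the explicit Integer.reverseBytes call, at the cost of having to write the read loop by hand.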

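【Discussion】:

This is correct as far as the documentation posted in the question goes. However, it turns out the documentation was wrong in another way: the example in the documentation is little-endian, but the actual data is big-endian.
@JoeW In that case, you can drop the reverseBytes call.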
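
Since the documented and the actual byte order turned out to differ, one option is to keep the byte order as a switch. The following is only a sketch along those lines: FrameReader, readMessage and the littleEndianPrefix flag are invented here, and the sample wire data in main is made up to resemble the big-endian case described in this discussion.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class FrameReader {
    /**
     * Reads one message: a 4-byte length prefix, then that many payload bytes.
     * littleEndianPrefix is a hypothetical switch, not part of either answer.
     */
    static String readMessage(DataInputStream in, boolean littleEndianPrefix) throws IOException {
        int len = in.readInt();               // readInt() always decodes big-endian
        if (littleEndianPrefix) {
            len = Integer.reverseBytes(len);  // only needed for a little-endian prefix
        }
        if (len < 0 || len >= (1 << 24)) {    // same sanity bound as the answer's assert
            throw new IOException("implausible message length: " + len);
        }
        byte[] payload = new byte[len];
        in.readFully(payload);                // blocks until all len bytes have arrived
        return new String(payload, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        // Made-up wire data: big-endian length 5, then "0102" and a NUL.
        byte[] bigEndianWire = {0, 0, 0, 5, '0', '1', '0', '2', 0};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bigEndianWire));
        System.out.println(readMessage(in, false).trim());
    }
}
```

Unlike a single InputStream.read(byte[]) call, which may return after filling only part of the array, readFully keeps reading until the whole payload has arrived, so the payload array can be sized exactly from the prefix.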

【Solution 2】:

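Network byte order is also known as big-endian. Judging by your data, though, little-endian is actually being used. At least, the first 4 bytes look like 5 when read as little-endian, not when read as big-endian. So you need to read those bytes, treat them as little-endian, and convert the result to a long to account for the "unsigned" part.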

public static void main(String[] args) throws IOException {
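    // Placeholder: pass the socket's input stream (e.g. clientSocket.getInputStream()) instead of null.
    DataInputStream inFromServer = new DataInputStream(new BufferedInputStream(null));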

    int iSize = inFromServer.readInt();
    iSize = Integer.reverseBytes(iSize); //read as little-endian

    long count = Integer.toUnsignedLong(iSize); //unsigned int
}
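
The "unsigned" step matters only once the high bit of the length is set, but it is easy to check in isolation. The sketch below decodes a little-endian unsigned 32-bit value by hand and compares it with the reverseBytes plus toUnsignedLong route used above. UnsignedLengthDemo, littleEndianUnsigned and toArraySize are names invented for this illustration.

```java
import java.nio.ByteBuffer;

public class UnsignedLengthDemo {
    /** Decodes a little-endian unsigned 32-bit length from the first four bytes of b. */
    static long littleEndianUnsigned(byte[] b) {
        return (b[0] & 0xFFL)
             | (b[1] & 0xFFL) << 8
             | (b[2] & 0xFFL) << 16
             | (b[3] & 0xFFL) << 24;
    }

    /** Narrows an unsigned length to a Java array size, rejecting values above max. */
    static int toArraySize(long count, int max) {
        if (count < 0 || count > max) {
            throw new IllegalArgumentException("length out of range: " + count);
        }
        return (int) count;
    }

    public static void main(String[] args) {
        byte[] prefix = {0x00, 0x00, 0x00, (byte) 0x80};   // high bit set
        int asBigEndianInt = ByteBuffer.wrap(prefix).getInt();
        long viaReverse = Integer.toUnsignedLong(Integer.reverseBytes(asBigEndianInt));
        // Both routes should agree on the same unsigned value.
        System.out.println(viaReverse == littleEndianUnsigned(prefix));
    }
}
```

Because Java arrays are indexed by int, the long still has to be range-checked before it is used as an array size, which is what the hypothetical toArraySize helper does.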

【Discussion】:

"Network byte order" is defined in the RFCs as big-endian.