How can I increase performance on reading the InputStream? (Answers)

【Question title】: How can I increase performance on reading the InputStream?
【Posted】: 2011-09-26 07:54:41
【Question】:

This is very likely just a KISS moment, but I feel I should ask anyway.

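I have a thread that reads data from a socket InputStream. Since I am dealing with particularly small data sizes (the data I can expect to receive is on the order of 100 - 200 bytes), I set the buffer array size to 256. As part of my read function, I have a check that ensures I got all of the data when I read from the InputStream. If I didn't, I recursively call the read function again. On each recursive call, I merge the two buffer arrays back together.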

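My problem is that, while I never anticipate needing more than the 256-byte buffer, I want to be safe. If sheep begin to fly and the data is significantly larger than the buffer, the read function will (by my estimate) start taking more time to complete, along an exponential curve.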

How can I increase the efficiency of the read function and/or the buffer merging?

Here is the read function as it stands.

int BUFFER_AMOUNT = 256;

private int read(byte[] buffer) throws IOException {
   int bytes = mInStream.read(buffer); // Read the input stream

   if (bytes == -1) { // If bytes == -1 then we didn't get all of the data

        byte[] newBuffer = new byte[BUFFER_AMOUNT]; // Try to get the rest
        int newBytes;
        newBytes = read(newBuffer); // Recurse until we have all the data

        byte[] oldBuffer = new byte[bytes + newBytes]; // make the final array size

        // Merge buffer into the beginning of old buffer.
        // We do this so that once the method finishes, we can just add the 
        // modified buffer to a queue later in the class for processing.
        for (int i = 0; i < bytes; i++) 
            oldBuffer[i] = buffer[i];

        for (int i = bytes; i < bytes + newBytes; i++) // Merge newBuffer into the latter half of old Buffer
            oldBuffer[i] = newBuffer[i];
        // Used for the recursion

        buffer = oldBuffer; // And now we set buffer to the new buffer full of all the data.
        return bytes + newBytes;
    }
    return bytes;
}

EDIT: Am I just being (unreasonably) paranoid, and should I simply set the buffer to 2048 and call it done?

【Comments】:

Tags: java sockets merge bytearray bytebuffer

【Answer 1】:

int BUFFER_AMOUNT = 256;

If you don't expect it to change at runtime, it should be final.

if (bytes == -1) {

should be !=

Also, I'm not entirely clear on what you are trying to accomplish with this code. Would you mind explaining?
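
As a rough illustration of the two points above, here is a minimal sketch. The class name `ReadCheckSketch` and the helper `gotData` are made up for this example, and a `ByteArrayInputStream` stands in for the socket stream so that it runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadCheckSketch {
    // Declared final because the size is not meant to change at runtime.
    private static final int BUFFER_AMOUNT = 256;

    // Hypothetical helper: read() returns -1 only at end of stream,
    // so any other value means the call did not hit end of stream.
    static boolean gotData(int bytes) {
        return bytes != -1;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the socket's InputStream (assumption for this sketch).
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        byte[] buffer = new byte[BUFFER_AMOUNT];

        int first = in.read(buffer);   // three bytes are available
        int second = in.read(buffer);  // the stream is now exhausted

        System.out.println("first read had data:  " + gotData(first));
        System.out.println("second read had data: " + gotData(second));
    }
}
```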

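【Comments】:

【Answer 2】:

I have no idea what you mean by "small data sizes". You should measure whether the time is spent in kernel mode (then you are issuing too many reads directly on the socket) or in user mode (then your algorithm is too complex).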

In the former case, just wrap the input in a BufferedInputStream with a 4096-byte buffer and read from that.

In the latter case, just use this code:
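
For the kernel-mode case, wrapping the stream might look like the following sketch. The class `BufferedWrapSketch` and the helper `buffered` are illustrative names, and a `ByteArrayInputStream` is used as a stand-in where real code would pass `socket.getInputStream()`.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class BufferedWrapSketch {
    // Buffer size suggested in the answer.
    static final int BUFFER_SIZE = 4096;

    // Hypothetical helper: wraps a raw stream so that small reads are served
    // from an in-memory buffer instead of hitting the underlying stream each time.
    static InputStream buffered(InputStream raw) {
        return new BufferedInputStream(raw, BUFFER_SIZE);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream() (assumption for this sketch).
        InputStream raw = new ByteArrayInputStream("hello".getBytes(StandardCharsets.US_ASCII));
        try (InputStream in = buffered(raw)) {
            byte[] buf = new byte[256];
            int n = in.read(buf);
            System.out.println("read " + n + " bytes through the buffered wrapper");
        }
    }
}
```

The wrapper should be created once per connection and reused, since bytes already pulled into its internal buffer are lost if a new wrapper is created around the same stream.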

/**
  * Reads as much as possible from the stream.
  * @return The number of bytes read into the buffer, or -1
  *         if nothing has been read because the end of file has been reached.
  */
static int readGreedily(InputStream is, byte[] buf, int start, int len) throws IOException {
  int nread;
  int ptr = start; // index at which the data is put into the buffer
  int rest = len; // number of bytes that we still want to read

  while ((nread = is.read(buf, ptr, rest)) > 0) {
    ptr += nread;
    rest -= nread;
  }

  int totalRead = len - rest;
  return (nread == -1 && totalRead == 0) ? -1 : totalRead;
}

This code completely avoids creating new objects and calling unnecessary methods, and on top of that it is simple.

【Comments】:
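
To see how the loop fills the buffer, here is a small self-contained sketch. It repeats `readGreedily` with the checked `IOException` declared so that it compiles, and adds a hypothetical `chunked` helper that hands out at most 3 bytes per call, loosely imitating a socket that delivers data in pieces. The class name `ReadGreedilyDemo` is likewise only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadGreedilyDemo {
    // Same loop as in the answer, with the checked exception declared.
    static int readGreedily(InputStream is, byte[] buf, int start, int len) throws IOException {
        int nread;
        int ptr = start; // next free index in buf
        int rest = len;  // bytes still wanted
        while ((nread = is.read(buf, ptr, rest)) > 0) {
            ptr += nread;
            rest -= nread;
        }
        int totalRead = len - rest;
        return (nread == -1 && totalRead == 0) ? -1 : totalRead;
    }

    // Hypothetical helper: returns at most 3 bytes per read call.
    static InputStream chunked(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30, 40, 50, 60, 70, 80};
        byte[] buf = new byte[256];
        int n = readGreedily(chunked(data), buf, 0, buf.length);
        // Each read lands at the advancing offset ptr, so earlier bytes are kept.
        System.out.println(n + " bytes read, first = " + buf[0] + ", last = " + buf[n - 1]);
    }
}
```

One caveat: on a socket that stays open, each `read` call blocks until at least one byte arrives, so the loop only returns once `len` bytes have been read or the stream has ended. Passing a `len` that matches the expected message size avoids waiting for data that will never come.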

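Thanks for the answer. I see what is happening here, but what are you doing with buf? It looks like it is just being overwritten on each iteration of the while loop. That way, the data never ends up anywhere.
In that case, please read the documentation of the read function. I am not overwriting any bytes here; I just keep reading further into the buffer.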

【Answer 3】:

Use a BufferedInputStream, as Roland stated, together with DataInputStream.readFully(), which replaces all of the looping code.

【Comments】:
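
A minimal sketch of that combination follows. It assumes the message length is known before reading (for example a fixed-size message), which the question does not state. The names `ReadFullySketch` and `readMessage` are invented for the example, and a `ByteArrayInputStream` stands in for the socket stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

final class ReadFullySketch {
    // Hypothetical helper: reads exactly `length` bytes.
    // readFully throws EOFException if the stream ends before that.
    static byte[] readMessage(DataInputStream in, int length) throws IOException {
        byte[] message = new byte[length];
        in.readFully(message);
        return message;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream() (assumption for this sketch).
        ByteArrayInputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6});
        // Wrap once per connection: buffering first, then the DataInputStream view.
        DataInputStream in = new DataInputStream(new BufferedInputStream(raw, 4096));

        byte[] first = readMessage(in, 4);
        System.out.println("first message length: " + first.length);

        try {
            readMessage(in, 4); // only 2 bytes remain
        } catch (EOFException e) {
            System.out.println("stream ended before the message was complete");
        }
    }
}
```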

Thanks, this is exactly what I needed.