Are there any design patterns for high-performance file parsing? (Answers)

【Question title】: Are there any design patterns for high performance file parsing?
【Posted】: 2015-02-04 16:38:35
【Question】:

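I recently developed my own file-parsing class, BufferedParseStream, and used it to decode PNG images. I have been comparing its performance against the open-source project PNGJ, and found that for smaller image sizes PNGJ can be up to twice as fast as my own implementation. I assume this is related to the implementation overhead of using BufferedInputStream, since PNGJ rolls its own equivalent.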

Are there any existing design patterns for high-performance parsing of a file into primitives such as int, float, etc.?

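import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BufferedParseStream extends BufferedInputStream {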

private final ByteBuffer  mByteBuffer;

public BufferedParseStream(final InputStream pInputStream, final int pBufferSize) {
    super(pInputStream, pBufferSize);
    /* Initialize the ByteBuffer (DataUtils is the asker's own helper class, not shown here). */
    this.mByteBuffer  = DataUtils.delegateNative(new byte[8]);
}

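private final void buffer(final int pNumBytes) throws IOException {
    /* Read the bytes into the ByteStorage. read() may return fewer bytes than requested, so loop. */
    final byte[] storage = this.getByteBuffer().array();
    int offset = 0;
    while (offset < pNumBytes) {
        final int count = this.read(storage, offset, pNumBytes - offset);
        if (count < 0) {
            throw new EOFException("Stream ended before " + pNumBytes + " bytes could be read.");
        }
        offset += count;
    }
    /* Reset the ByteBuffer Location. */
    this.getByteBuffer().position(0);
}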

public final char parseChar() throws IOException {
    /* Read the bytes for one char (getChar() consumes two bytes). */
    this.buffer(DataUtils.BYTES_PER_CHAR);
    /* Return the corresponding character. */
    return this.getByteBuffer().getChar();
}

public final int parseInt() throws IOException {
    /* Read four bytes. */
    this.buffer(DataUtils.BYTES_PER_INT);
    /* Return the corresponding integer. */
    return this.getByteBuffer().getInt();
}

public final long parseLong() throws IOException {
    /* Read eight bytes. */
    this.buffer(DataUtils.BYTES_PER_LONG);
    /* Return the corresponding long. */
    return this.getByteBuffer().getLong();
}

public final void setParseOrder(final ByteOrder pByteOrder) {
    this.getByteBuffer().order(pByteOrder);
}

private final ByteBuffer getByteBuffer() {
    return this.mByteBuffer;
}

}

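【Comments】:

I don't think we can talk about design patterns in this case, but rather about algorithm optimization, which in most cases is specific to the algorithm. Try to determine which part of the code takes too much time and fix it.
I understand, and sorry for mixing up the terminology. Do you see any particular flaws in BufferedParseStream?
Nothing that stands out, but I don't know much about these classes and methods. Is this the only part of the code you wrote? Isn't there a class that decodes the PNG image? If you wrote that one, it is the part most likely to be inefficient.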

Tags: java file parsing binary

【Solution 1】:

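Java NIO should generally be faster than using input streams. The class you posted looks odd to me (maybe it's just me :)), because it puts an extra layer on top of ByteBuffer that I don't think is necessary.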
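
To illustrate the NIO route, here is a minimal sketch that reads the first 24 bytes of a PNG file through a FileChannel straight into a ByteBuffer, with no stream wrapper in between. The class name NioHeaderSketch and the method readPngDimensions are made up for this example and are not part of the question's code or of PNGJ. It relies only on the standard PNG layout (8-byte signature, then the IHDR chunk with big-endian width and height), and it makes no claim about how much faster this is in practice.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical example class; the names here are illustrative only.
public class NioHeaderSketch {

    private static final long PNG_SIGNATURE = 0x89504E470D0A1A0AL;
    private static final int IHDR_TYPE = 0x49484452; // ASCII "IHDR"

    /** Returns {width, height} read from the IHDR chunk of a PNG file. */
    static int[] readPngDimensions(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // 8-byte signature + 4-byte length + 4-byte type + 4-byte width + 4-byte height.
            ByteBuffer header = ByteBuffer.allocate(24).order(ByteOrder.BIG_ENDIAN);
            // A single read() may fill only part of the buffer, so loop until it is full.
            while (header.hasRemaining()) {
                if (channel.read(header) < 0) {
                    throw new EOFException("File is too short to contain a PNG header.");
                }
            }
            header.flip(); // switch the buffer from filling to draining
            if (header.getLong() != PNG_SIGNATURE) {
                throw new IOException("Missing PNG signature.");
            }
            header.getInt(); // chunk length (13 for IHDR), not needed here
            if (header.getInt() != IHDR_TYPE) {
                throw new IOException("First chunk is not IHDR.");
            }
            int width = header.getInt();
            int height = header.getInt();
            return new int[] { width, height };
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: NioHeaderSketch <file.png>");
            return;
        }
        int[] size = readPngDimensions(Paths.get(args[0]));
        System.out.println(size[0] + " x " + size[1]);
    }
}
```

For a full decoder you would probably use a larger, reused buffer (or a mapped file) rather than a 24-byte one; the small buffer here only keeps the sketch short.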

You should use the ByteBuffer directly: it has getInt, getFloat, and similar methods whose results you can assign straight to the variables you need.
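
As a rough sketch of what using the buffer directly can look like, the example below parses an int, a float, and a long from a ByteBuffer with no wrapper class around it. The 16-byte layout and the names ByteBufferParseSketch, Sample, and parseSample are invented for illustration; they do not come from the question or from any PNG structure.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical example class; the record layout is an assumption for illustration.
public class ByteBufferParseSketch {

    /** Illustrative 16-byte record: int id, float scale, long timestamp. */
    static final class Sample {
        final int id;
        final float scale;
        final long timestamp;

        Sample(int id, float scale, long timestamp) {
            this.id = id;
            this.scale = scale;
            this.timestamp = timestamp;
        }
    }

    /** Reads one record at the buffer's current position, advancing it by 16 bytes. */
    static Sample parseSample(ByteBuffer buffer) {
        int id = buffer.getInt();          // 4 bytes
        float scale = buffer.getFloat();   // 4 bytes
        long timestamp = buffer.getLong(); // 8 bytes
        return new Sample(id, scale, timestamp);
    }

    public static void main(String[] args) {
        // Build some sample bytes; in real code they would come from a file or channel.
        byte[] raw = ByteBuffer.allocate(16)
                .order(ByteOrder.BIG_ENDIAN)
                .putInt(42)
                .putFloat(1.5f)
                .putLong(123456789L)
                .array();

        // Wrap the existing array (no copy) and parse with the same byte order.
        ByteBuffer buffer = ByteBuffer.wrap(raw).order(ByteOrder.BIG_ENDIAN);
        Sample sample = parseSample(buffer);
        System.out.println(sample.id + " " + sample.scale + " " + sample.timestamp);
    }
}
```

The byte order is set once on the buffer, which covers what setParseOrder did in the question's class, and each get call advances the position, so no separate bookkeeping is needed.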

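That said, I think your performance problem may lie in the PNG decoder code, as others have already mentioned. You should post it for further analysis.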

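【Discussion】:

I didn't realize I was adding extra complexity! I'll refactor my work to remove the redundant component. As for the PNG decoder in general, I asked this question because the surrounding code is roughly the same across the other streaming approaches I have tried. That is because there is only a limited choice of algorithms for parsing pixels and streaming them into a buffer. Thanks for the advice. (And welcome to SO!)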