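【Posted】: 2012-03-19 01:31:57
【Question】:
How do you read the same InputStream twice? Is there a way to copy it?
I need to fetch an image from the web, save it locally, and then return the saved image. I assumed it would be faster to reuse the same stream than to open a new stream on the downloaded content and read it again.
【Question comments】:
Possibly by using mark and reset.
Tags: java inputstream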
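Depending on where the InputStream comes from, you may not be able to reset it. You can use markSupported() to check whether mark() and reset() are supported.
If they are, call mark() before reading and reset() afterwards to return to the marked position. If they are not, you have to read the InputStream from its source again.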
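The decision described above can be sketched as follows. This is only a minimal illustration, not a definitive implementation: readTwice and the Supplier used to reopen the source are hypothetical names introduced here, and readAllBytes() assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.Arrays;
import java.util.function.Supplier;

final class ReReadSketch {

    /**
     * Reads the whole source twice and returns both copies.
     * Uses mark()/reset() when the stream supports it; otherwise asks the
     * (hypothetical) supplier to open the source a second time.
     */
    static byte[][] readTwice(Supplier<InputStream> source) throws IOException {
        try (InputStream in = source.get()) {
            if (in.markSupported()) {
                // Large read limit: a buffering stream may then keep everything
                // read before reset() in memory.
                in.mark(Integer.MAX_VALUE);
                byte[] first = in.readAllBytes();
                in.reset();                       // back to the marked position
                return new byte[][] { first, in.readAllBytes() };
            }
            byte[] first = in.readAllBytes();
            try (InputStream again = source.get()) {   // reopen from the source
                return new byte[][] { first, again.readAllBytes() };
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { 1, 2, 3, 4, 5 };
        // ByteArrayInputStream supports mark/reset.
        byte[][] a = readTwice(() -> new ByteArrayInputStream(data));
        // PushbackInputStream reports markSupported() == false.
        byte[][] b = readTwice(() -> new PushbackInputStream(new ByteArrayInputStream(data)));
        System.out.println(Arrays.equals(a[0], a[1]) && Arrays.equals(b[0], b[1])); // prints true
    }
}
```

For a network source the supplier would open a new connection, which is exactly the cost the question hopes to avoid, so the in-memory approaches in the other answers are usually preferable there.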
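【Comments】:
Subclasses of InputStream such as BufferedInputStream do support marking.
If you are working with an implementation of InputStream, you can check the result of InputStream#markSupported(), which tells you whether you can use the mark() / reset() methods.
If you can mark the stream when you start reading, call reset() to go back to the marked position.
If you cannot, you have to open the stream again.
Another solution is to convert the InputStream to a byte array and then iterate over the array as many times as you need. The post "Convert InputStream to byte array in Java" lists several ways to do that, with or without third-party libraries. Be aware that you may run into memory problems if the content is too large.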
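A JDK-only version of the byte-array approach might look like the sketch below. The class and method names are made up for illustration; on Java 9 and later, InputStream.readAllBytes() could replace the copy loop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class BufferOnceSketch {

    /** Drains the stream into a byte array using only JDK classes. */
    static byte[] toByteArray(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        byte[] bytes = toByteArray(source);   // the source is consumed exactly once

        // Each pass gets its own lightweight stream over the same array.
        for (int pass = 1; pass <= 2; pass++) {
            try (InputStream view = new ByteArrayInputStream(bytes)) {
                String text = new String(toByteArray(view), StandardCharsets.UTF_8);
                System.out.println("pass " + pass + ": " + text);
            }
        }
    }
}
```

The whole content lives in memory, so this only suits payloads of a known, moderate size.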
Finally, if what you need is to read an image, use:
BufferedImage image = ImageIO.read(new URL("http://www.example.com/images/toto.jpg"));
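Using ImageIO#read(java.net.URL) also allows you to use a cache.
【Comments】:
A caveat about ImageIO#read(java.net.URL): some web servers and CDNs may reject the bare request that ImageIO#read issues (that is, a request without a User-Agent header that makes the server believe the call comes from a web browser). In that case, opening the connection with URL.openConnection(), setting a User-Agent on that connection, and then calling ImageIO.read(InputStream) will succeed in most cases.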
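The workaround from this comment could be sketched roughly as below. The class name, the readImage helper and any User-Agent value passed to it are assumptions made for illustration, and whether a given server accepts the request depends entirely on that server.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import javax.imageio.ImageIO;

final class UserAgentImageSketch {

    /**
     * Opens the URL with an explicit User-Agent header and decodes the body.
     * Returns null when no registered ImageIO reader understands the data.
     */
    static BufferedImage readImage(URL url, String userAgent) throws IOException {
        URLConnection connection = url.openConnection();
        // Must be set before the connection is actually established.
        connection.setRequestProperty("User-Agent", userAgent);
        try (InputStream in = connection.getInputStream()) {
            return ImageIO.read(in);
        }
    }
}
```

A call would pass the image URL together with a browser-like string such as "Mozilla/5.0 (compatible; ExampleClient/1.0)", which is a placeholder value here.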
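InputStream is not an interface.
Convert the input stream to bytes, then pass the bytes to the save-file function, where they are turned back into an input stream. The original function can keep using the same bytes for its other tasks.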
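Applied to the question's use case (download, save, return the image), that idea might look like this sketch. SaveAndDecodeSketch and saveAndDecode are hypothetical names, the in-memory PNG merely stands in for the downloaded stream, and readAllBytes() assumes Java 9 or later.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;

final class SaveAndDecodeSketch {

    /** Reads the stream once, writes the bytes to target, and decodes the same bytes. */
    static BufferedImage saveAndDecode(InputStream in, Path target) throws IOException {
        byte[] bytes = in.readAllBytes();                       // single pass over the source
        Files.write(target, bytes);                             // first use: persist to disk
        return ImageIO.read(new ByteArrayInputStream(bytes));   // second use: decode
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the downloaded stream: a 4x2 PNG built in memory.
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        ImageIO.write(new BufferedImage(4, 2, BufferedImage.TYPE_INT_RGB), "png", png);

        Path target = Files.createTempFile("downloaded", ".png");
        BufferedImage image = saveAndDecode(new ByteArrayInputStream(png.toByteArray()), target);
        System.out.println(image.getWidth() + "x" + image.getHeight()
                + ", saved " + Files.size(target) + " bytes");
        Files.delete(target);
    }
}
```

The network stream is read only once, and neither the file write nor the decode has to touch it again.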
【Comments】:
You can use org.apache.commons.io.IOUtils.copy to copy the contents of the InputStream into a byte array, and then read from that array repeatedly through a ByteArrayInputStream. For example:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
org.apache.commons.io.IOUtils.copy(in, baos);
byte[] bytes = baos.toByteArray();
// either
while (needToReadAgain) {
    ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
    yourReadMethodHere(bais);
}
// or
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
while (needToReadAgain) {
    bais.reset();
    yourReadMethodHere(bais);
}
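【Comments】:
If your InputStream supports marking, you can mark() it and later reset() it. If your InputStream does not support marking, you can use the java.io.BufferedInputStream class and wrap your stream in a BufferedInputStream like this:
InputStream bufferedInputStream = new BufferedInputStream(yourInputStream);
bufferedInputStream.mark(some_value); // some_value is the read limit: the most bytes you will read before calling reset()
// read your bufferedInputStream
bufferedInputStream.reset();
// read it again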
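A runnable variant of the wrapping idea is sketched below. The helper names markable and readUpTo are invented for this example, and PushbackInputStream is used only as a convenient stand-in for a source without mark support. The read limit passed to mark() has to cover every byte read before reset(), otherwise reset() may fail with an IOException.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.Arrays;

final class BufferedMarkSketch {

    /** Returns the stream itself if it can mark, otherwise a BufferedInputStream around it. */
    static InputStream markable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /** Reads up to n bytes, stopping early at end of stream. */
    static byte[] readUpTo(InputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int total = 0;
        int read;
        while (total < n && (read = in.read(buf, total, n - total)) != -1) {
            total += read;
        }
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        // Stand-in for a source whose markSupported() returns false.
        InputStream raw = new PushbackInputStream(new ByteArrayInputStream(data));

        InputStream in = markable(raw);
        in.mark(64);                       // read limit: at most 64 bytes before reset()
        byte[] first = readUpTo(in, 4);
        in.reset();                        // back to the marked position
        byte[] second = readUpTo(in, 4);
        System.out.println(Arrays.toString(first) + " " + Arrays.toString(second));
        // prints [1, 2, 3, 4] [1, 2, 3, 4]
    }
}
```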
【Comments】:
See BufferedInputStream.fill(): it has a "grow buffer" part in which the new buffer size is compared only against marklimit and MAX_BUFFER_SIZE.
You can wrap the input stream in a PushbackInputStream. PushbackInputStream lets you unread ("write back") bytes that have already been read, so you can do this:
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class StreamTest {
    public static void main(String[] args) throws IOException {
        byte[] bytes = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        InputStream originalStream = new ByteArrayInputStream(bytes);
        byte[] readBytes = getBytes(originalStream, 3);
        printBytes(readBytes); // prints: 1 2 3
        readBytes = getBytes(originalStream, 3);
        printBytes(readBytes); // prints: 4 5 6
        // now let's wrap it with PushbackInputStream
        originalStream = new ByteArrayInputStream(bytes);
        // 10 means that at most 10 bytes can be "written back" to the stream
        PushbackInputStream wrappedStream = new PushbackInputStream(originalStream, 10);
        readBytes = getBytes(wrappedStream, 3);
        printBytes(readBytes); // prints: 1 2 3
        wrappedStream.unread(readBytes, 0, readBytes.length);
        readBytes = getBytes(wrappedStream, 3);
        printBytes(readBytes); // prints: 1 2 3
    }
    private static byte[] getBytes(InputStream is, int howManyBytes) throws IOException {
        byte[] buf = new byte[howManyBytes];
        for (int i = 0; i < howManyBytes; i++) {
            int next = is.read();
            if (next >= 0) { // -1 signals end of stream; 0 is a valid byte value
                buf[i] = (byte) next;
            }
        }
        return buf;
    }
    private static void printBytes(byte[] buffer) {
        System.out.print("Reading stream: ");
        for (int i = 0; i < buffer.length; i++) {
            System.out.print(buffer[i] + " ");
        }
        System.out.println();
    }
}
Note that PushbackInputStream keeps an internal buffer of bytes, so it does allocate a buffer in memory to hold the bytes that are "written back".
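The size of that internal buffer is a hard limit. The small sketch below (class and helper names are made up for illustration) shows that unread() refuses more bytes than the pushback buffer can hold, while a push-back that fits succeeds.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

final class PushbackLimitSketch {

    /** Returns true if the bytes fit into the pushback buffer, false if unread() rejects them. */
    static boolean tryUnread(PushbackInputStream in, byte[] bytes) {
        try {
            in.unread(bytes);
            return true;
        } catch (IOException bufferFull) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { 1, 2, 3, 4, 5 };
        // Pushback buffer of only 2 bytes.
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data), 2);

        byte[] three = new byte[3];
        in.read(three);                                          // consumes 1 2 3
        System.out.println(tryUnread(in, three));                // prints false: 3 bytes do not fit
        System.out.println(tryUnread(in, new byte[] { 2, 3 }));  // prints true: 2 bytes fit
        System.out.println(in.read());                           // prints 2
    }
}
```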
Knowing this approach, we can go a step further and combine it with FilterInputStream. FilterInputStream stores the original input stream as a delegate. This makes it possible to define a new class that automatically "unreads" the data it has just read. The class is defined as follows:
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class TryReadInputStream extends FilterInputStream {
    private final int maxPushbackBufferSize;
    /**
     * Creates a <code>TryReadInputStream</code> by wrapping the argument <code>in</code>
     * in a <code>PushbackInputStream</code> and assigning it to the field <code>this.in</code>
     * so as to remember it for later use.
     *
     * @param in the underlying input stream
     * @param maxPushbackBufferSize the maximum number of bytes a single tryRead call may read
     */
    public TryReadInputStream(InputStream in, int maxPushbackBufferSize) {
        super(new PushbackInputStream(in, maxPushbackBufferSize));
        this.maxPushbackBufferSize = maxPushbackBufferSize;
    }
    /**
     * Reads up to <code>length</code> bytes from the input stream into the given buffer.
     * The bytes read are still available in the stream afterwards.
     *
     * @param buffer the destination buffer into which the data is read
     * @param offset the start offset in the destination <code>buffer</code>
     * @param length how many bytes to read from the stream into the buffer. Must not be greater than
     *               <code>maxPushbackBufferSize</code>, or an IOException is thrown
     *
     * @return number of bytes read
     * @throws java.io.IOException in case length is greater than <code>maxPushbackBufferSize</code>
     */
    public int tryRead(byte[] buffer, int offset, int length) throws IOException {
        validateMaxLength(length);
        // NOTE: reading byte by byte instead of calling read(buffer, offset, length),
        // because a bulk read may return fewer bytes than requested
        int bytesRead = 0;
        int nextByte = 0;
        for (int i = 0; (i < length) && (nextByte >= 0); i++) {
            nextByte = read();
            if (nextByte >= 0) {
                buffer[offset + bytesRead++] = (byte) nextByte;
            }
        }
        if (bytesRead > 0) {
            ((PushbackInputStream) in).unread(buffer, offset, bytesRead);
        }
        return bytesRead;
    }
    public byte[] tryRead(int maxBytesToRead) throws IOException {
        validateMaxLength(maxBytesToRead);
        // ByteArrayOutputStream grows its internal array on demand, which avoids
        // allocating a possibly large buffer up front (if maxBytesToRead is large)
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // NOTE: reading byte by byte for the same reason as above
        int nextByte = 0;
        for (int i = 0; (i < maxBytesToRead) && (nextByte >= 0); i++) {
            nextByte = read();
            if (nextByte >= 0) {
                baos.write((byte) nextByte);
            }
        }
        byte[] buffer = baos.toByteArray();
        if (buffer.length > 0) {
            ((PushbackInputStream) in).unread(buffer, 0, buffer.length);
        }
        return buffer;
    }
    private void validateMaxLength(int length) throws IOException {
        if (length > maxPushbackBufferSize) {
            throw new IOException(
                    "Trying to read more bytes than maxBytesToRead. Max bytes: " + maxPushbackBufferSize + ". Trying to read: " +
                    length);
        }
    }
}
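This class has two methods. One reads into an existing buffer (its definition mirrors public int read(byte b[], int off, int len) of the InputStream class). The second returns a new buffer (which may be more efficient when the number of bytes to read is not known in advance).
Now let's see our class in action: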
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamTest2 {
    public static void main(String[] args) throws IOException {
        byte[] bytes = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        InputStream originalStream = new ByteArrayInputStream(bytes);
        byte[] readBytes = getBytes(originalStream, 3);
        printBytes(readBytes); // prints: 1 2 3
        readBytes = getBytes(originalStream, 3);
        printBytes(readBytes); // prints: 4 5 6
        // now let's use our TryReadInputStream
        originalStream = new ByteArrayInputStream(bytes);
        TryReadInputStream wrappedStream = new TryReadInputStream(originalStream, 10);
        // NOTE: no manual call to "unread"(!) because TryReadInputStream handles this internally
        readBytes = wrappedStream.tryRead(3);
        printBytes(readBytes); // prints: 1 2 3
        readBytes = wrappedStream.tryRead(3);
        printBytes(readBytes); // prints: 1 2 3
        readBytes = wrappedStream.tryRead(3);
        printBytes(readBytes); // prints: 1 2 3
        // we can also call normal read, which actually consumes the bytes without "writing them back"
        readBytes = getBytes(wrappedStream, 3);
        printBytes(readBytes); // prints: 1 2 3
        readBytes = getBytes(wrappedStream, 3);
        printBytes(readBytes); // prints: 4 5 6
        readBytes = wrappedStream.tryRead(3); // now we can try to read the next bytes
        printBytes(readBytes); // prints: 7 8 9
        readBytes = wrappedStream.tryRead(3);
        printBytes(readBytes); // prints: 7 8 9
    }
    // Same helpers as in StreamTest, repeated so that this class compiles on its own
    private static byte[] getBytes(InputStream is, int howManyBytes) throws IOException {
        byte[] buf = new byte[howManyBytes];
        for (int i = 0; i < howManyBytes; i++) {
            int next = is.read();
            if (next >= 0) {
                buf[i] = (byte) next;
            }
        }
        return buf;
    }
    private static void printBytes(byte[] buffer) {
        System.out.print("Reading stream: ");
        for (int i = 0; i < buffer.length; i++) {
            System.out.print(buffer[i] + " ");
        }
        System.out.println();
    }
}
【Comments】:
How about:
if (!stream.markSupported()) {
    // let's replace the stream object
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    IOUtils.copy(stream, baos);
    stream.close();
    stream = new ByteArrayInputStream(baos.toByteArray());
    // now the stream supports 'mark' and 'reset'
}
【Comments】:
If anyone is running in a Spring Boot application and wants to read the response body of a RestTemplate (which is why I wanted to read a stream twice), there is a clean(er) way to do this.
First, use Spring's StreamUtils to copy the stream into a String:
String text = StreamUtils.copyToString(response.getBody(), Charset.defaultCharset());
But that is not all. You also need a request factory that buffers the stream for you, like so:
ClientHttpRequestFactory factory = new BufferingClientHttpRequestFactory(new SimpleClientHttpRequestFactory());
RestTemplate restTemplate = new RestTemplate(factory);
Or, if you are using a factory bean (this one is Kotlin, but nevertheless):
@Bean
@Scope(ConfigurableBeanFactory.SCOPE_PROTOTYPE)
fun createRestTemplate(): RestTemplate = RestTemplateBuilder()
    .requestFactory { BufferingClientHttpRequestFactory(SimpleClientHttpRequestFactory()) }
    .additionalInterceptors(loggingInterceptor)
    .build()
【Comments】:
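To split an InputStream in two and process both parts independently, without loading all the data into memory:
Create a pair of OutputStreams, more precisely PipedOutputStreams.
Connect each PipedOutputStream to a PipedInputStream; these PipedInputStreams are the InputStreams that get returned.
Connect the source InputStream to the OutputStreams just created, so that everything read from the source InputStream is written to both OutputStreams. There is no need to implement this, because TeeInputStream (commons.io) already does it.
In a separate thread, read the whole source InputStream; the input data is thereby transferred implicitly to the target InputStreams.
// Requires commons-io (TeeInputStream, TeeOutputStream) plus java.io.*, java.util.* and java.util.concurrent.*
public static final List<InputStream> splitInputStream(InputStream input)
        throws IOException
{
    Objects.requireNonNull(input);
    PipedOutputStream pipedOut01 = new PipedOutputStream();
    PipedOutputStream pipedOut02 = new PipedOutputStream();
    List<InputStream> inputStreamList = new ArrayList<>();
    inputStreamList.add(new PipedInputStream(pipedOut01));
    inputStreamList.add(new PipedInputStream(pipedOut02));
    TeeOutputStream tout = new TeeOutputStream(pipedOut01, pipedOut02);
    TeeInputStream tin = new TeeInputStream(input, tout, true);
    ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.submit(() -> {
        // Draining tin copies every byte to both pipes; closing it also closes the pipes (closeBranch = true)
        try (tin) {
            byte[] chunk = new byte[8192];
            while (tin.read(chunk) != -1) {
                // nothing to do: TeeInputStream forwards the bytes as a side effect of reading
            }
        }
        return null;
    });
    executor.shutdown(); // lets the worker thread exit once the copy has finished
    return Collections.unmodifiableList(inputStreamList);
}
Remember to close the returned InputStreams once they have been consumed, and make sure the thread that drains the TeeInputStream terminates.
In case you need to split the stream into more than two InputStreams, replace the TeeOutputStream class in the previous snippet with your own implementation that encapsulates a List<OutputStream> and overrides the methods of OutputStream: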
import java.io.IOException;
import java.io.OutputStream;
import java.util.List;
import java.util.Objects;

public final class TeeListOutputStream extends OutputStream {
    private final List<? extends OutputStream> branchList;
    public TeeListOutputStream(final List<? extends OutputStream> branchList) {
        Objects.requireNonNull(branchList);
        this.branchList = branchList;
    }
    // Every byte is forwarded to each branch in turn
    @Override
    public synchronized void write(final int b) throws IOException {
        for (OutputStream branch : branchList) {
            branch.write(b);
        }
    }
    @Override
    public void flush() throws IOException {
        for (OutputStream branch : branchList) {
            branch.flush();
        }
    }
    @Override
    public void close() throws IOException {
        for (OutputStream branch : branchList) {
            branch.close();
        }
    }
}
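For comparison, the same splitting idea can be sketched with JDK classes only, which also makes the threading requirement visible. This is an illustrative sketch rather than the answer's implementation: PipedSplitSketch and split are hypothetical names, and the try-with-resources form and readAllBytes() assume Java 9 or later. Each pipe has a small fixed-size buffer, so both returned streams must be consumed concurrently; a slow or absent reader blocks the copying thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PipedSplitSketch {

    /** Returns two streams that each deliver every byte of source. */
    static InputStream[] split(InputStream source) throws IOException {
        PipedOutputStream out1 = new PipedOutputStream();
        PipedOutputStream out2 = new PipedOutputStream();
        InputStream[] branches = { new PipedInputStream(out1), new PipedInputStream(out2) };

        Thread pump = new Thread(() -> {
            // Closing the pipes at the end signals end-of-stream to both readers.
            try (source; out1; out2) {
                byte[] chunk = new byte[1024];
                int n;
                while ((n = source.read(chunk)) != -1) {
                    out1.write(chunk, 0, n);   // blocks while reader 1 lags behind
                    out1.flush();              // wakes a reader waiting for data
                    out2.write(chunk, 0, n);   // blocks while reader 2 lags behind
                    out2.flush();
                }
            } catch (IOException e) {
                // A pipe was closed or broken on the reading side; stop copying.
            }
        }, "stream-pump");
        pump.setDaemon(true);
        pump.start();
        return branches;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[10_000];
        Arrays.fill(data, (byte) 7);
        InputStream[] branches = split(new ByteArrayInputStream(data));

        // Each branch gets its own consumer thread.
        ExecutorService readers = Executors.newFixedThreadPool(2);
        Future<byte[]> a = readers.submit(() -> branches[0].readAllBytes());
        Future<byte[]> b = readers.submit(() -> branches[1].readAllBytes());
        System.out.println(Arrays.equals(a.get(), data) && Arrays.equals(b.get(), data)); // prints true
        readers.shutdown();
    }
}
```

The consumers here collect everything with readAllBytes() only to keep the demonstration short; real consumers would process the bytes as they arrive.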
【Comments】:
If you make HTTP calls with RestTemplate, just add an interceptor. The response body is cached by the ClientHttpResponse implementation, so the input stream can be retrieved from the response as many times as needed. Note that the body is only re-readable when the response is buffered, for example through BufferingClientHttpRequestFactory.
ClientHttpRequestInterceptor interceptor = new ClientHttpRequestInterceptor() {
    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
            ClientHttpRequestExecution execution) throws IOException {
        ClientHttpResponse response = execution.execute(request, body);
        // additional work before returning response
        return response;
    }
};
// Add the interceptor to the RestTemplate instance
restTemplate.getInterceptors().add(interceptor);
【Comments】:
ByteArrayInputStream ins = new ByteArrayInputStream("Hello".getBytes());
System.out.println("ins.available() at beginning:: " + ins.available());
ins.mark(0);
// Read input stream for some operations
ins.read(new byte[3], 0, 3);
System.out.println("ins.available() after reading :: " + ins.available());
ins.reset();
System.out.println("ins.available() after resetting :: " + ins.available());
// ins is ready for reading once again.
【Comments】: