感谢 Abhijit Sarkar 带路的回答。
我需要下载大量 JSON 流并将其分解为可流式管理的小型数据片段。
JSON 由具有大属性的对象组成:这些大属性可以序列化为文件,从而从未编组的 JSON 对象中删除。
另一个用例是逐个对象下载 JSON 流对象,像 map/reduce 算法一样处理它并生成单个输出,而无需将整个流加载到内存中。
另一个用例是读取一个大的 JSON 文件,然后根据条件只选择几个对象,同时解组为普通的旧 Java 对象。
这里有一个例子:我们想流式传输一个非常大的 JSON 文件,它是一个数组,并且我们想只检索数组中的第一个对象。
鉴于服务器上的这个大文件,可在http://example.org/testings.json 获得:
[
{ "property1": "value1", "property2": "value2", "property3": "value3" },
{ "property1": "value1", "property2": "value2", "property3": "value3" },
... 1446481 objects => a file of 104 MB => take quite long to download...
]
这个 JSON 数组的每一行都可以解析为这个对象:
@lombok.Data
public class Testing {
String property1;
String property2;
String property3;
}
你需要这个类使解析代码可重用:
import com.fasterxml.jackson.core.JsonParser;
import java.io.IOException;
@FunctionalInterface
public interface JsonStreamer<R> {
/**
* Parse the given JSON stream, process it, and optionally return an object.<br>
* The returned object can represent a downsized parsed version of the stream, or the result of a map/reduce processing, or null...
*
* @param jsonParser the parser to use while streaming JSON for processing
* @return the optional result of the process (can be {@link Void} if processing returns nothing)
* @throws IOException on streaming problem (you are also strongly encouraged to throw HttpMessageNotReadableException on parsing error)
*/
R stream(JsonParser jsonParser) throws IOException;
}
还有这个类要解析:
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import lombok.AllArgsConstructor;
import org.springframework.http.HttpInputMessage;
import org.springframework.http.HttpOutputMessage;
import org.springframework.http.MediaType;
import org.springframework.http.converter.HttpMessageConverter;
import java.io.IOException;
import java.util.Collections;
import java.util.List;
@AllArgsConstructor
public class StreamingHttpMessageConverter<R> implements HttpMessageConverter<R> {
private final JsonFactory factory;
private final JsonStreamer<R> jsonStreamer;
@Override
public boolean canRead(Class<?> clazz, MediaType mediaType) {
return MediaType.APPLICATION_JSON.isCompatibleWith(mediaType);
}
@Override
public boolean canWrite(Class<?> clazz, MediaType mediaType) {
return false; // We only support reading from an InputStream
}
@Override
public List<MediaType> getSupportedMediaTypes() {
return Collections.singletonList(MediaType.APPLICATION_JSON);
}
@Override
public R read(Class<? extends R> clazz, HttpInputMessage inputMessage) throws IOException {
try (InputStream inputStream = inputMessage.getBody();
JsonParser parser = factory.createParser(inputStream)) {
return jsonStreamer.stream(parser);
}
}
@Override
public void write(R result, MediaType contentType, HttpOutputMessage outputMessage) {
throw new UnsupportedOperationException();
}
}
然后,这里是用于流式传输 HTTP 响应、解析 JSON 数组并仅返回第一个未编组对象的代码:
// You should @Autowire these:
JsonFactory jsonFactory = new JsonFactory();
ObjectMapper objectMapper = new ObjectMapper();
RestTemplateBuilder restTemplateBuilder = new RestTemplateBuilder();
// If detectRequestFactory true (default): HttpComponentsClientHttpRequestFactory will be used and it will consume the entire HTTP response, even if we close the stream early
// If detectRequestFactory false: SimpleClientHttpRequestFactory will be used and it will close the connection as soon as we ask it to
RestTemplate restTemplate = restTemplateBuilder.detectRequestFactory(false).messageConverters(
new StreamingHttpMessageConverter<>(jsonFactory, jsonParser -> {
// While you use a low-level JsonParser to not load everything in memory at once,
// you can still profit from smaller object mapping with the ObjectMapper
if (!jsonParser.isClosed() && jsonParser.nextToken() == JsonToken.START_ARRAY) {
if (!jsonParser.isClosed() && jsonParser.nextToken() == JsonToken.START_OBJECT) {
return objectMapper.readValue(jsonParser, Testing.class);
}
}
return null;
})
).build();
final Testing firstTesting = restTemplate.getForObject("http://example.org/testings.json", Testing.class);
log.debug("First testing object: {}", firstTesting);