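【Posted】: 2019-08-30 07:24:47
【Question】:
I have fairly large gzip-compressed RDF files (30 GB gzipped, ~300 GB gunzipped) that I need to process line by line and gzip back into another file. This is what I have so far (the test file is ~150 MB gzipped):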
const fs = require('fs');
const zlib = require('zlib');
const readline = require('readline');

const readStream = fs.createReadStream('21million.rdf.gz').pipe(zlib.createGunzip());
const writeStream = fs.createWriteStream("21million.rdf");
const gzipStream = zlib.createGzip();
gzipStream.pipe(writeStream);

const rl = readline.createInterface({
  input: readStream,
  output: gzipStream,
});

rl.on('line', (line) => {
  gzipStream.write(`${line.toUpperCase()}\n`);
});

rl.on('close', () => {
  console.log('done');
  gzipStream.end();
});
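The problem is that this process fails with: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
So the question is: how can I set this up so that it does not run out of memory?
P.S. I know this can be done with sed, awk, perl, etc., but I need to do it in JS.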
【Discussion】:
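A likely cause of the OOM is that the `line` handler calls `gzipStream.write()` without checking its return value, so lines are read faster than gzip and the disk can drain them and the pending writes pile up in memory. One possible way around this, sketched below under assumptions, is to drop `readline` and chain everything with `stream.pipeline`, which propagates backpressure between stages. The helper names `splitLines`, `createLineTransform` and `processGzipLines`, as well as the output file name in the usage comment, are hypothetical and not part of the original post. The sketch assumes UTF-8 input with `\n` line endings.

```javascript
const fs = require('fs');
const zlib = require('zlib');
const { Transform, pipeline } = require('stream');
const { StringDecoder } = require('string_decoder');

// Split text into complete lines; whatever follows the last '\n' is
// returned as `rest` so it can be prepended to the next chunk.
function splitLines(text) {
  const parts = text.split('\n');
  const rest = parts.pop();
  return { lines: parts, rest };
}

// Hypothetical helper: a Transform stream that applies `mapLine` to each line.
// Note: a trailing '\r' (Windows line endings) is left as part of the line.
function createLineTransform(mapLine) {
  const decoder = new StringDecoder('utf8'); // keeps multi-byte chars intact across chunks
  let carry = ''; // incomplete last line of the previous chunk

  return new Transform({
    transform(chunk, _encoding, callback) {
      const { lines, rest } = splitLines(carry + decoder.write(chunk));
      carry = rest;
      if (lines.length > 0) {
        this.push(lines.map((line) => `${mapLine(line)}\n`).join(''));
      }
      callback();
    },
    flush(callback) {
      // Emit a final line that was not terminated by '\n', if any.
      const tail = carry + decoder.end();
      if (tail.length > 0) {
        this.push(`${mapLine(tail)}\n`);
      }
      callback();
    },
  });
}

// Hypothetical helper: gunzip -> per-line transform -> gzip -> file.
// `pipeline` pauses upstream stages whenever a downstream stage is busy.
function processGzipLines(srcPath, dstPath, mapLine, done) {
  pipeline(
    fs.createReadStream(srcPath),
    zlib.createGunzip(),
    createLineTransform(mapLine),
    zlib.createGzip(),
    fs.createWriteStream(dstPath),
    done
  );
}

// Usage (the output file name is a placeholder):
// processGzipLines('21million.rdf.gz', '21million.upper.rdf.gz',
//   (line) => line.toUpperCase(),
//   (err) => console.log(err ? `failed: ${err.message}` : 'done'));
```

With this layout, memory use should be bounded by the stream buffers plus the longest single line, rather than by the size of the file. If `readline` has to stay, an alternative is to call `rl.pause()` when `write()` returns `false` and `rl.resume()` on the `drain` event, although lines already buffered by `readline` may still be emitted after `pause()`.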
Tags: javascript node.js file-io zlib