【Title】:Streaming data to gzip file crashes with "JavaScript heap out of memory" (NodeJS)
【Posted】:2019-11-21 23:49:48
【Question】:

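I have this small script that dumps a bunch of text data from a source to disk as gzip. Most of the sources I pull from at work are fine, but I've run into one that throws JavaScript heap out of memory.

Here is a snapshot of what it does: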

const fs = require('fs');
const zlib = require('zlib');

const file = fs.createWriteStream('file.gz');
const gzip = zlib.createGzip();
gzip.pipe(file);

// ... code to connect to someDataSource would be here

someDataSource.on('data', (line) => { // feeding lines of text
    gzip.write(line);
});

someDataSource.on('done', () => {
    // crashes before this point
    gzip.end();
});

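I suspect the zlib module is buffering far more than it should before flushing to disk. At the time of the crash the gz file is only about 4MB. As I said above, the other data sources I pull from at work all produce gz files well over 50MB.

The module's documentation is here: https://nodejs.org/api/zlib.html#zlib_class_options

I'm not sure how to tune the options to make it behave.

The crash: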

<--- Last few GCs --->

[33692:0x10264e000]    97556 ms: Scavenge 1370.6 (1411.7) -> 1363.3 (1412.2) MB, 4.5 / 0.0 ms  (average mu = 0.174, current mu = 0.137) allocation failure 
[33692:0x10264e000]    97569 ms: Scavenge 1371.0 (1412.2) -> 1363.7 (1413.7) MB, 4.5 / 0.0 ms  (average mu = 0.174, current mu = 0.137) allocation failure 
[33692:0x10264e000]    97582 ms: Scavenge 1371.3 (1413.7) -> 1364.0 (1430.2) MB, 4.5 / 0.0 ms  (average mu = 0.174, current mu = 0.137) allocation failure 


<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0xdd88f3dbe3d]
Security context: 0x32b80cc1e6e9 <JSObject>
    1: /* anonymous */(aka /* anonymous */) [0x32b897904941] [/some/path/node_modules/tedious/lib/token/stream-parser.js:~154] [pc=0xdd88f6fbec4](this=0x32b8101826f1 <undefined>)
    2: valueParse(aka valueParse) [0x32b8c73a8ab9] [/some/path/node_modules/tedious/lib/value-parser.js:~74] [pc=0xdd88f6c96d3](this=0x32b8101826f1 ...

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0x10003c597 node::Abort() [/usr/local/bin/node]
 2: 0x10003c7a1 node::OnFatalError(char const*, char const*) [/usr/local/bin/node]
 3: 0x1001ad575 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 4: 0x100579242 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/bin/node]
 5: 0x10057bd15 v8::internal::Heap::CheckIneffectiveMarkCompact(unsigned long, double) [/usr/local/bin/node]
 6: 0x100577bbf v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/usr/local/bin/node]
 7: 0x100575d94 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
 8: 0x100574998 v8::internal::Heap::HandleGCRequest() [/usr/local/bin/node]
 9: 0x10052a1c8 v8::internal::StackGuard::HandleInterrupts() [/usr/local/bin/node]
10: 0x1007d9bb1 v8::internal::Runtime_StackGuard(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/bin/node]
11: 0xdd88f3dbe3d 
12: 0xdd88f6fbec4 
13: 0xdd88f6c96d3 
14: 0xdd88f6c8870 
[1]    33692 abort      node app.js

【Comments】:

    Tags: node.js gzip zlib


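    【Solution 1】:

    Add a 'drain' event listener. gzip.write() returns synchronously and only queues the chunk, so a source that emits data faster than gzip can compress it fills the stream's internal buffer without limit. When write() returns false, pause the source, and resume it once gzip emits 'drain'.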

    someDataSource.on('data', (line) => { // feeding lines of text
        const ok = gzip.write(line);
        if(!ok) {
            someDataSource.pause();
        }
    });
    gzip.on('drain', () => {
        someDataSource.resume();
    });
    
    someDataSource.on('done', () => {
        // all data has been received: flush and close the gzip stream
        gzip.end();
    });
    

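    Or use the pipe method directly, which handles this backpressure for you (this assumes someDataSource is a Readable stream).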

    someDataSource.pipe(gzip).pipe(file);
    

    【Comments】:

    • That did it! Luckily my someDataSource has a pause method; I'm using node-mssql with on('row').
    【Solution 2】:

    You can also try increasing the memory allocated to Node.js:

    node --max-old-space-size=8192 your_script.js
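
To check whether the flag took effect, one option is to print the heap limit V8 reports from inside the script. This is only a small sketch; `heapLimitMB` is a hypothetical helper name, and the reported value may differ somewhat from the number passed on the command line.

```javascript
const v8 = require('v8');

// Returns the V8 heap size limit of the current process, in megabytes.
function heapLimitMB() {
  return Math.round(v8.getHeapStatistics().heap_size_limit / (1024 * 1024));
}

console.log(`V8 heap limit: about ${heapLimitMB()} MB`);
```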
    

    【Comments】:

    • This might work, but it has limits and will eventually crash on larger data sets. What @lx1412 provided scales indefinitely.