【Question title】: Counting number of lines in a transform stream (Node.js)
【Posted】: 2016-08-15 06:56:08
【Question】:

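I have a test.log file that my IDE reports as containing 2842000 lines of text. I have a transform stream, and I want it to emit an object with the line count and some other information. When I console.log my total line count, it comes out higher, at 2846596 lines. Where is the extra data coming from?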

var fs = require('fs');
var util = require('util');
const Transform = require('stream').Transform;


//init timer and transform stream
var TransformStream = function(){
  Transform.call(this, {objectMode: true});

  //adding timer to TransformStream prototype;
  this.timer = process.hrtime();

  // adding a buffer to the transform stream
  this.buffer = new Buffer('');
}
util.inherits(TransformStream, Transform);//inheriting Transform into TransformStream

//total lines and bytes init
var sumBytes = 0;
var totalLines = 0;

//_transform function that needs to be defined in a transform stream.
TransformStream.prototype._transform = function(chunk, encoding, callback){
  // transform before
  // console.log("Transform:" + chunk);
  this.buffer = new Buffer(chunk);

  // transforming here
  // getting total number of lines
  var lines = (this.buffer.toString().split('\n').length);
  console.log(typeof(lines));
  console.log(totalLines + " plus " + lines);
  totalLines += lines;

  console.log(totalLines);
  // summing total length
  sumBytes += (this.buffer.length);
  //transform after 
  // console.log(this.buffer);
  var time = process.hrtime(this.timer);
  // pushing chunk out of transform
  var summaryObj = {elapsed_time: time, total_length_in_bytes: totalLines, total_lines: sumBytes}
  // console.log(summaryObj);
  // this.push(chunk);
  this.push(summaryObj);
  callback();
}


ts = new TransformStream;

ts.on('data', function(data){
  // sumBytes += data.length;
  console.log("This is the total amount of bytes in test.log: " + sumBytes);


  // console.log("This are the number lines in the chunk: " + readableData.length)


  // console.log(string);
})

ts.on('end', function(chunk){
  // WHY THE F IS CHUNK UNDEFINED HERE ON THE END EVENT!!!!!!!!!
  // clearing start
  // ending timer on end emitter for transformation
  // totalLines += string.split("\n").length;
  // console.log(totalLines);

  // console.log(time);
  // console.log(totalLines);
  console.log('we are in the end event for the transform stream ' + chunk);
})


rs = fs.createReadStream('test.log');
ws = fs.createWriteStream('transform.log');
rs.pipe(ts);

This is the tail of the output I get when I run it in the terminal:

2840564
This is the total amount of bytes in test.log: NaN
number
2840564 plus 619
2841183
This is the total amount of bytes in test.log: NaN
number
2841183 plus 620
2841803
This is the total amount of bytes in test.log: NaN
number
2841803 plus 619
2842422
This is the total amount of bytes in test.log: NaN
number
2842422 plus 619
2843041
This is the total amount of bytes in test.log: NaN
number
2843041 plus 619
2843660
This is the total amount of bytes in test.log: NaN
number
2843660 plus 620
2844280
This is the total amount of bytes in test.log: NaN
number
2844280 plus 619
2844899
This is the total amount of bytes in test.log: NaN
number
2844899 plus 619
2845518
This is the total amount of bytes in test.log: NaN
number
2845518 plus 620
2846138
This is the total amount of bytes in test.log: NaN
number
2846138 plus 458
2846596
This is the total amount of bytes in test.log: NaN
we are in the end event for the transform stream undefined

【Comments】:

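  • Is this.buffer an ArrayBuffer?
  • Hmm... I'm going through the Node docs. Do I have to _flush the transform stream? Is that why I'm getting the extra data?
  • If this.buffer is a Uint8Array, not the actual text data within the TypedArray, what is the purpose of this.buffer.toString().split('\n').length?
  • That line is meant to count the bytes of the file. I believe buffer.length translates to bytes.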
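On the _flush question raised above: _flush is optional, but Node calls it once after the last chunk has been transformed, so it is a convenient place to emit a single summary object instead of one per chunk. The following is only a minimal sketch under assumptions that are not in the original post: the class name LineCountStream is hypothetical, lines are taken to be terminated by '\n', and a final unterminated line is counted as a line.

```javascript
const { Transform } = require('stream');

// Hypothetical class name, for illustration only.
class LineCountStream extends Transform {
  constructor() {
    // Only the readable side emits objects; input chunks stay Buffers.
    super({ readableObjectMode: true });
    this.newlines = 0;
    this.bytes = 0;
    this.lastByte = null;
  }

  _transform(chunk, encoding, callback) {
    this.bytes += chunk.length;
    // Count newline bytes (0x0a) rather than split pieces.
    for (const byte of chunk) {
      if (byte === 0x0a) this.newlines += 1;
    }
    if (chunk.length > 0) this.lastByte = chunk[chunk.length - 1];
    callback(); // nothing is pushed per chunk
  }

  _flush(callback) {
    // Assumption: a last line without a trailing '\n' still counts as a line.
    const unterminated = this.bytes > 0 && this.lastByte !== 0x0a ? 1 : 0;
    this.push({
      total_lines: this.newlines + unterminated,
      total_bytes: this.bytes
    });
    callback();
  }
}

// Usage: the three lines "a", "bb", "c" arrive split across two chunks.
const counter = new LineCountStream();
counter.on('data', (summary) => console.log(summary)); // total_lines: 3, total_bytes: 8
counter.write('a\nb');
counter.end('b\nc\n');
```

With a file, the same stream could be fed with fs.createReadStream('test.log').pipe(new LineCountStream()).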

Tags: javascript node.js stream transform


【Answer 1】:

Part of the problem is that this.buffer.toString().split('\n').length, if buffer is a Uint8Array, does not return the number of bytes in this.buffer. It returns the .length of the array produced by .split("\n").

var buffer = new Uint8Array(100);
var lines = buffer.toString().split("\n");
console.log(`lines:${lines}\nlines.length:${lines.length}\nbuffer.byteLength:${buffer.byteLength}`);

You can use buffer.byteLength to get the number of bytes in the ArrayBuffer.

【Comments】:

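  • I believe you can count the bytes by splitting the buffer and counting the resulting array.length; I got the same output from byteLength as from splitting and counting the array. I suppose using byteLength is probably cheaper.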
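A caveat on the comment above, as a hedged sketch: assuming "splitting" means text.split(''), the two counts agree only while every character occupies a single byte, as in plain ASCII logs. The helper names and sample strings below are hypothetical.

```javascript
const ascii = 'hello\nworld\n';
// \u00e9 (e with acute) and \u00f6 (o with diaeresis) take 2 bytes each in UTF-8.
const accented = 'h\u00e9llo\nw\u00f6rld\n';

// Number of UTF-16 code units, i.e. what split('') counts.
function charCount(text) {
  return text.split('').length;
}

// Number of bytes the text occupies when encoded as UTF-8.
function byteCount(text) {
  return Buffer.byteLength(text, 'utf8');
}

console.log(charCount(ascii), byteCount(ascii));       // 12 12
console.log(charCount(accented), byteCount(accented)); // 12 14
```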