【Question title】: Count Duplicate Lines from File using node.js
【Posted】: 2021-12-28 17:20:47
【Question】:

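I have to read a large .csv file line by line, take the first column of each line (a country name), and count the duplicates. For example, if the file contains: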

USA
UK
USA

The output should be:

USA - 2
UK - 1

Code:

const fs = require('fs')
const readline = require('readline')

const file = readline.createInterface({
    input: fs.createReadStream('file.csv'),
    output: process.stdout,
    terminal: false
})

file.on('line', line => {
    const country = line.split(",", 1)
    const number = ??? // don't know how to check duplicates
    const result = country + number

    if(lineCount >= 1 && country != `""`) {
        console.log(result)
    }
    lineCount++
})

【Comments】:

  • Push each value into an array and use .includes

Tags: javascript node.js stream fs readline


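【Solution 1】:

To start with, String.prototype.split returns an array. Since you limit the split to one element, it looks like you want the first value in that array, so you need to take it out by index or by destructuring. You can read more here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split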
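To illustrate the point about split, here is a minimal sketch. The sample line is made up for the example, since the question does not show the file's other columns.

```javascript
// Hypothetical sample line; the real file's columns are not shown in the question.
const line = 'USA,Washington,331000000'

// With a limit of 1, split still returns an array, here with a single element.
const parts = line.split(',', 1)
console.log(Array.isArray(parts)) // true
console.log(parts)                // [ 'USA' ]

// Destructure (or use parts[0]) to get the string itself.
const [country] = parts
console.log(country)              // USA

// A line without any comma comes back whole.
console.log('UK'.split(',', 1))   // [ 'UK' ]
```

This also explains why `country + number` in the question behaves oddly: `country` there is an array, which is converted to a string during concatenation.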

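Next, you can build a map of all the countries that stores how many times each one has been seen, then log the results once the file has been read completely.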


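const fs = require('fs')
const readline = require('readline')

// Readline interface over the CSV file, as in the question
// (no output stream is needed just to read lines)
const file = readline.createInterface({
    input: fs.createReadStream('file.csv'),
    terminal: false
})

const countries = {}
let lineCount = 0
file.on('line', line => {
    // Destructure the array to grab the first value, then trim it so that
    // " USA" and "USA" are counted under the same key
    const [firstColumn] = line.split(",", 1)
    const country = firstColumn.trim()
    // Skip the first line (lineCount 0), as the question's code does,
    // and skip lines whose first column is empty
    if (lineCount >= 1 && country !== "") {
        // First time this country is seen: start at 1, otherwise increment
        if (!countries[country]) {
            countries[country] = 1
        } else {
            countries[country]++
        }
    }
    lineCount++
})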

// Add another event listener for when the file has finished being read
// You may access the country data here, since this callback function
// won't be called till the file has been read
// https://nodejs.org/api/readline.html#event-close
file.on('close', () => {
    for (const country in countries) {
        console.log(`${country} - ${countries[country]}`)
    }
})
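
As a variation on the same idea, the sketch below uses a Map and the async iterator that readline interfaces provide, instead of 'line' and 'close' listeners. The function name countFirstColumn and the skipHeader parameter are hypothetical, and the code assumes fields never contain quoted commas. A Map is used here because a plain object inherits keys such as "constructor", which would break the `!countries[country]` check if such a value ever appeared in the first column.

```javascript
const fs = require('fs')
const readline = require('readline')

// Hypothetical helper: counts how often each first-column value occurs.
// Assumption: simple CSV with no quoted commas in the first field.
// skipHeader mirrors the question's `lineCount >= 1` check.
async function countFirstColumn(path, skipHeader = true) {
    const rl = readline.createInterface({
        input: fs.createReadStream(path),
        crlfDelay: Infinity // treat \r\n as a single line break
    })

    const counts = new Map()
    let isFirstLine = true
    for await (const line of rl) {
        if (isFirstLine) {
            isFirstLine = false
            if (skipHeader) continue
        }
        const country = line.split(',', 1)[0].trim()
        if (country === '') continue
        // Start at 1 for a new key, otherwise increment the stored count
        counts.set(country, (counts.get(country) || 0) + 1)
    }
    return counts
}

// Possible usage (not executed here):
// countFirstColumn('file.csv').then(counts => {
//     for (const [country, n] of counts) console.log(`${country} - ${n}`)
// })
```

A Map also iterates in insertion order, so the countries are printed in the order they first appear in the file.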
