【Question title】: How do I remove this symbol from the CSV file that is generated by Puppeteer?
【Posted】: 2020-05-01 14:26:06
【Question description】:

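When I generate the CSV file, every "item" value in the output contains an Â symbol. How can I remove it in my code? I tried writing the file as utf-8 because I read that the encoding might be the cause. Any ideas? Example: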

const products = await page.$$('.item-row'); 

Promise.all(products.map(async product => {
  // Inside each product, find the product SKU, its human-readable name, and its price
  let productId = await product.$eval(".custom-body-copy", el => el.innerText.trim().replace(/,/g,' -').replace('Item  ', ''));
  let productName = await product.$eval(".body-copy-link", el => el.innerText.trim().replace(/,/g,' -'));
  let productPrice = await product.$eval(".product_desc_txt div span", el => el.innerText.trim().replace(/,/g,' -'));

  // Format them as a CSV line
  return productId + ',' + productName + ',' + productPrice + ',';
})).then(lines => {
  // Write the lines to a file
  fs.writeFileSync("products.csv", lines.join('\n'), 'utf-8');
  browser.close();
});
});

【Question comments】:

    Tags: node.js csv web-scraping export-to-csv puppeteer


    【Solution 1】:

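    There may be a better solution, but the first thing that comes to mind is to split the string into an array with split(), use .map to compare each character's code against the code of Â, and then turn the array back into a string with join(), like this:

    const strToChange = 'My string with an Â char';
    
    // Character code of the symbol we want to drop
    const charWeDoNotWant = 'Â'.charCodeAt(0);
    
    const toFixArr = strToChange.split('');
    
    // Replace every occurrence of the unwanted character with an empty string
    const fixedArr = toFixArr.map(char =>
      char.charCodeAt(0) === charWeDoNotWant ? '' : char
    );
    
    const fixedStr = fixedArr.join('');
    console.log(`String without Â: ${fixedStr}`);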
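The same split-and-join idea can probably be written more compactly. The sketch below is only an illustration: `stripCharacter` is a hypothetical helper name, not something from the question, and it assumes the unwanted symbol really is present in the string as the character Â (U+00C2).

```javascript
// Hypothetical helper: remove every occurrence of one unwanted character.
// The default is U+00C2 (Â), the symbol reported in the question.
function stripCharacter(text, unwanted = '\u00C2') {
  // Splitting on the character and re-joining drops all of its occurrences
  return text.split(unwanted).join('');
}

const fixedStr = stripCharacter('My string with an \u00C2 char');
console.log(fixedStr);
```

Note that this only removes the character itself; any spaces around it are left as they were.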
    

    【Comments】:

    • This is what I ended up doing. I think the whitespace was causing the odd character to appear: let productId = await product.$eval(".custom-body-copy", el => el.innerText.trim().replace(/,/g,' -').replace('Item', '').replace(/\s+/g, ''));
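A likely explanation for why the whitespace fix works, offered as an assumption rather than something confirmed in the thread: the scraped text contains a non-breaking space (U+00A0). In UTF-8 that character is the bytes 0xC2 0xA0, and a program that reads the file as Latin-1 or Windows-1252 displays 0xC2 as Â. In JavaScript, `\s` also matches U+00A0, so normalizing whitespace removes it. The sketch below uses hypothetical helpers (`cleanCell`, `toCsv`) and made-up sample values; it also prepends a UTF-8 byte order mark, which spreadsheet programs such as Excel commonly use to detect UTF-8.

```javascript
// Hypothetical helper: normalize one scraped value for a simple CSV cell.
function cleanCell(text) {
  return text
    .replace(/\s+/g, ' ')  // \s matches U+00A0 too, so NBSP becomes a plain space
    .trim()
    .replace(/,/g, ' -');  // same comma handling as in the question's code
}

// Hypothetical helper: join rows into CSV text with a leading UTF-8 BOM.
function toCsv(rows) {
  const body = rows
    .map(cells => cells.map(cleanCell).join(','))
    .join('\n');
  return '\uFEFF' + body;
}

// Made-up sample row containing non-breaking spaces
const csv = toCsv([['Item\u00A012345', 'Blue\u00A0Widget, large', '$9.99']]);
console.log(JSON.stringify(csv));
```

The resulting string could then be passed to fs.writeFileSync("products.csv", csv, 'utf-8') in place of lines.join('\n'). This sketch does not quote cells, so it is only suitable for values without embedded quotes or line breaks.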