【Question title】: How to use multiple links in .goto(url) with Puppeteer?
【Posted】: 2019-10-24 00:20:39
【Question description】:
const puppeteer = require("puppeteer");

(async () => {
    try {
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();

        await page.goto('url/c-0');
        await page.waitForSelector('.box-chap');
        const element = await page.$(".box-chap");
        const content = await page.evaluate(element => element.textContent, element);

        console.log(content + "chapter");
    } catch (error) {

    }
})();

Hi everyone. Right now I want to loop over 'url/c-0', 'url/c-1', 'url/c-2', and so on.

Please give me a solution. Thanks, everyone.

【Question comments】:

    Tags: node.js web-crawler puppeteer


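    【Solution 1】:

    Just loop over your job. If your chapter URLs all share the same format, you can use a for loop to visit every chapter you want to scrape, reusing the same page for each one.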

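    const puppeteer = require("puppeteer");

    (async () => {
      let browser;
      try {
        browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();

        const endOfChapterNumber = 10; // index of the last chapter (c-0 through c-10)
        // The counter must be declared with `let`: `const` cannot be incremented.
        for (let c = 0; c <= endOfChapterNumber; c++) {
          const chapterUrl = 'url/c-' + c;
          await page.goto(chapterUrl);
          await page.waitForSelector('.box-chap');
          const element = await page.$(".box-chap");
          const content = await page.evaluate(element => element.textContent, element);

          console.log(content + " chapter: " + c);
        }
      } catch (error) {
        // Report the failure instead of silently swallowing it.
        console.error(error);
      } finally {
        // Close the browser so the Node process can exit.
        if (browser) await browser.close();
      }
    })();
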
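
    As a variation on the loop above, here is a minimal sketch that splits the work into two helpers, so that one failing chapter does not stop the rest. The names `buildChapterUrls` and `scrapeChapters` are hypothetical, not part of Puppeteer, and `'url/c-'` is still the placeholder base URL from the question. `page` is assumed to be a Puppeteer `Page` created with `browser.newPage()`.

```javascript
// Hypothetical helper: builds baseUrl + 0, baseUrl + 1, ..., baseUrl + lastChapter.
function buildChapterUrls(baseUrl, lastChapter) {
  const urls = [];
  for (let c = 0; c <= lastChapter; c++) {
    urls.push(baseUrl + c);
  }
  return urls;
}

// Hypothetical helper: visits each URL in turn with the same page and
// collects the text content of the first element matching `selector`.
// A failure on one URL is recorded and the loop continues with the next.
async function scrapeChapters(page, urls, selector) {
  const results = [];
  for (const url of urls) {
    try {
      await page.goto(url);
      await page.waitForSelector(selector);
      // page.$eval runs the callback in the browser on the matched element.
      const content = await page.$eval(selector, (el) => el.textContent);
      results.push({ url, content });
    } catch (error) {
      results.push({ url, error: error.message });
    }
  }
  return results;
}
```

    To use it, call `await scrapeChapters(page, buildChapterUrls('url/c-', 10), '.box-chap')` inside the async function, after `browser.newPage()`. The chapters are fetched one at a time because a single page can only hold one navigation at once; fetching in parallel would need one page per chapter.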

    【Comments】:
