【Question title】: How to scrape a Google Trends URL?
【Posted at】: 2022-11-02 19:23:29
【Question description】:

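I am scraping a Google Trends URL with Node.js, but every request returns a 429 status code. The same request works fine in Postman, with the same headers that I pass in my code.

Here is my code: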

const unirest = require("unirest")

const getData = async() => {


    let url = "https://trends.google.com/trends/api/explore?tz=420&req=%7B%22comparisonItem%22%3A%5B%7B%22keyword%22%3A%22audi%22%2C%22geo%22%3A%22%22%2C%22time%22%3A%22today+12-m%22%7D%2C%7B%22keyword%22%3A%22mercedes%22%2C%22geo%22%3A%22%22%2C%22time%22%3A%22today+12-m%22%7D%2C%7B%22keyword%22%3A%22bmw%22%2C%22geo%22%3A%22%22%2C%22time%22%3A%22today+12-m%22%7D%5D%2C%22category%22%3A0%2C%22property%22%3A%22%22%7D"

    const response = await unirest
    .get(url)
    .headers({
        "User-Agent":
        "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36",
    })
    
    console.log(response.body)

}
getData();
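
For readers following along: the `req` parameter in the URL above is URL-encoded JSON. A minimal sketch of building the same kind of URL from a plain object is shown here. `buildExploreUrl` and its option names are hypothetical helpers introduced for illustration; the field names (`comparisonItem`, `keyword`, `geo`, `time`, `category`, `property`) are read off the decoded URL in the question. This only makes the request easier to edit and does not by itself address the 429.

```javascript
// Sketch: build the explore URL from a plain object instead of hand-encoding it.
// buildExploreUrl is a hypothetical helper; the field names come from the
// decoded `req` parameter in the question's URL.
const buildExploreUrl = (keywords, { tz = 420, time = "today 12-m", geo = "" } = {}) => {
  const req = {
    comparisonItem: keywords.map((keyword) => ({ keyword, geo, time })),
    category: 0,
    property: "",
  };
  // URLSearchParams percent-encodes the JSON and writes spaces as "+".
  const params = new URLSearchParams({ tz: String(tz), req: JSON.stringify(req) });
  return `https://trends.google.com/trends/api/explore?${params}`;
};

console.log(buildExploreUrl(["audi", "mercedes", "bmw"]));
```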

【Question comments】:

    Tags: node.js web-scraping unirest


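    【Solution 1】:

    Google is a difficult platform to scrape. Besides its bot-prevention system, it frequently runs A/B tests, which change the page layout and force extra adjustments to a scraper. As an engineer at WebScrapingAPI, I can recommend our Google Trends Scraper. Here is how it works: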

    const axios = require('axios')
    
    const API_KEY = '<YOUR_API_KEY>'
    const QUERY = 'test'
    
    // encodeURIComponent (not encodeURI) also escapes "&", "=" and "?",
    // so the query cannot break out of its own parameter.
    const SCRAPER = `https://api.searchdata.io/v1?engine=google_trends&api_key=${API_KEY}&q=${encodeURIComponent(QUERY)}`
    
    const scrape = async () => {
      try {
        const response = await axios.get(SCRAPER)
        console.log(response.data)
      } catch (e) {
        console.error(e)
      }
    }
    
    scrape()
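
Status 429 means Too Many Requests, so if you keep calling the endpoint directly, one common mitigation is to retry with growing delays. The following is a minimal sketch, not a guaranteed fix: it will not get past bot detection on its own. `fetchWithBackoff`, `doRequest`, and the option names are hypothetical; `doRequest` is assumed to be any async function that resolves to an object with `status` and lowercase-keyed `headers`.

```javascript
// Sketch: retry a request while the server answers 429 (Too Many Requests).
// fetchWithBackoff and its options are hypothetical names for illustration.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const fetchWithBackoff = async (doRequest, { retries = 4, baseMs = 1000, wait = sleep } = {}) => {
  for (let attempt = 0; ; attempt++) {
    const response = await doRequest();
    // Return on any non-429 status, or once the retry budget is used up.
    if (response.status !== 429 || attempt >= retries) return response;
    // Prefer a numeric Retry-After hint (seconds) when the server sends one;
    // otherwise double the delay on every attempt. A Retry-After given as an
    // HTTP date is ignored by this sketch and falls back to the doubling delay.
    const retryAfter = Number(response.headers?.["retry-after"]);
    const delayMs = retryAfter > 0 ? retryAfter * 1000 : baseMs * 2 ** attempt;
    await wait(delayMs);
  }
};
```

With the question's code, `doRequest` could be a function that returns the `unirest.get(url).headers({...})` request, assuming the resolved response exposes `status` and `headers` in that shape. The `wait` option exists only so the delay can be replaced in tests.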
    

    Alternatively, you can render the page with Puppeteer, although you may still get blocked. Here is a script:

    const puppeteer = require("puppeteer")
    
    const main = async () => {
        const browser = await puppeteer.launch({
            headless: false,
            defaultViewport: null,
            acceptInsecureCerts: true,
        })
        try {
            const page = await browser.newPage()
            await page.goto('https://trends.google.com/trends/api/explore?tz=420&req=%7B%22comparisonItem%22%3A%5B%7B%22keyword%22%3A%22audi%22%2C%22geo%22%3A%22%22%2C%22time%22%3A%22today+12-m%22%7D%2C%7B%22keyword%22%3A%22mercedes%22%2C%22geo%22%3A%22%22%2C%22time%22%3A%22today+12-m%22%7D%2C%7B%22keyword%22%3A%22bmw%22%2C%22geo%22%3A%22%22%2C%22time%22%3A%22today+12-m%22%7D%5D%2C%22category%22%3A0%2C%22property%22%3A%22%22%7D')
            // page.content() returns the rendered document as an HTML string.
            const html = await page.content()
            console.log(html)
        } finally {
            // Always close the browser so the Node process can exit.
            await browser.close()
        }
    }
    
    main().catch(console.error)
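
Once a request succeeds, the body still has to be turned into data. The sketch below assumes, as is common for Google JSON endpoints, that the payload is JSON preceded by a short non-JSON guard such as `)]}'`; treat that as an assumption and check what the endpoint actually returns. `parseTrendsBody` is a hypothetical helper. It expects the raw response text (for example from the `text()` method of the response that `page.goto` resolves to), not the HTML-wrapped string from `page.content()`.

```javascript
// Sketch: turn the raw response text into an object.
// parseTrendsBody is a hypothetical helper. It assumes the payload is a JSON
// object preceded by a short non-JSON guard such as ")]}'".
const parseTrendsBody = (text) => {
  const start = text.indexOf("{");
  if (start === -1) throw new Error("No JSON object found in response body");
  // Everything from the first "{" onward is parsed as JSON.
  return JSON.parse(text.slice(start));
};

// Made-up payload for illustration only; not a real Google Trends response.
console.log(parseTrendsBody(")]}'\n" + JSON.stringify({ ok: true })));
```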
    

    【Comments】:
