【Question title】: BS4: Finding the parent class in a Google SERP using Python
【Posted】: 2020-02-28 00:58:42
【Question】:

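I want to scrape headlines from Google's search results. The problem is that when I write a for loop, I get "TypeError: find() takes no keyword arguments".

The fix I found was simple: remove ".text" from the source line (code shown below). But when I do that, I get a different error: "TypeError: object of type 'Response' has no len()". Is there a workaround? The code below is the version with ".text" included.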

from bs4 import BeautifulSoup
import requests

source = requests.get("https://www.google.com/search?q=online+education").text

for soup in BeautifulSoup(source, 'lxml'):
    headline = soup.find("div", class_="BNeawe vvjwJb AP7Wnd")
    print(headline)

I expect all ten headlines from the Google search engine results page to be returned.

【Comments】:

    Tags: python-3.x for-loop web-scraping beautifulsoup typeerror


    【Solution 1】:

    First find all elements with the class BNeawe vvjwJb AP7Wnd, then iterate over the results.

    import requests
    from bs4 import BeautifulSoup
    
    source = requests.get("https://www.google.com/search?q=online+education").text
    
    soup = BeautifulSoup(source, 'lxml')
    
    headlines = soup.find_all("div", class_="BNeawe vvjwJb AP7Wnd")
    
    for headline in headlines:
        print(headline.text, end=', ')
    

    Output:

    What is online education | Definition of Online education is ..., Online Education | Encyclopedia.com, 5 Advantages Of Online Learning: Education Without Leaving Home ..., Online Education & Teaching Courses | Harvard University, What is online education? - Lynda.com, What is Online Education? - Online-Education.net, 50 Top Online Learning Sites - Best College Reviews, Benefits of Online Education | Community College of Aurora in ..., 10 Advantages of Taking Online Classes | OEDB.org, Online learning in higher education - Wikipedia, 
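
For context on the error in the question, here is a minimal sketch. It assumes the cause is that iterating directly over a BeautifulSoup object yields its top-level child nodes, and that some of those nodes (the doctype, stray whitespace) are string-like objects whose find() is the plain str.find(), which rejects keyword arguments such as class_. The second error most likely comes from passing the Response object itself, where BeautifulSoup expects a string or bytes. The HTML below is a hand-written stand-in, not real Google markup, and html.parser is used only so the sketch runs without lxml.

```python
from bs4 import BeautifulSoup, NavigableString

# Hand-written stand-in for a downloaded page (assumption, not real Google HTML).
html = """<!DOCTYPE html>
<html><body>
<div class="BNeawe vvjwJb AP7Wnd">First headline</div>
<div class="BNeawe vvjwJb AP7Wnd">Second headline</div>
</body></html>"""

soup = BeautifulSoup(html, "html.parser")

# Iterating over the soup walks its top-level children, not search results.
top_level_types = [type(node).__name__ for node in soup]
print(top_level_types)

# String-like nodes are str subclasses, so node.find(...) is str.find(...),
# which does not accept keyword arguments like class_.
string_nodes = [node for node in soup if isinstance(node, NavigableString)]
print(all(isinstance(node, str) for node in string_nodes))

# Searching the whole tree once avoids the problem.
headlines = [
    div.get_text()
    for div in soup.find_all("div", class_="BNeawe vvjwJb AP7Wnd")
]
print(headlines)
```

Calling find_all() on the soup once and looping over its result, as in the solution above, keeps every loop item a Tag.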
    

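    【Comments】:

      【Solution 2】:

      Try using CSS selectors with the bs4 select()/select_one() methods instead. They are usually faster, easier to read, and more flexible.

      Make sure you send a user-agent with your request; otherwise Google or other sites will eventually block your requests.

      Pass the user-agent in the request headers: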

      headers = {
          'User-agent':
          "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
      }
      
      requests.get(YOUR_URL, headers=headers)
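
To check what will actually be sent before hitting the network, a request can be prepared without being sent. This is only a sketch: the URL is the one from the question, and the query value is an assumption chosen for illustration.

```python
import requests

headers = {
    "User-agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
    )
}

# Illustrative query (assumption); requests URL-encodes it for us.
params = {"q": "online education"}

# Build the request without sending it, so nothing touches the network.
prepared = requests.Request(
    "GET", "https://www.google.com/search", headers=headers, params=params
).prepare()

print(prepared.url)                    # final URL including the encoded query
print(prepared.headers["User-Agent"])  # header lookup is case-insensitive
```

Passing the query through params instead of concatenating it into the URL leaves the encoding to requests.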
      

      Code:

      from bs4 import BeautifulSoup
      import requests
      
      headers = {
          'User-agent':
          "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
      }
      
      params = {
        "q": "how to create minecraft server" # query
      }
      
      html = requests.get('https://www.google.com/search', headers=headers, params=params)
      soup = BeautifulSoup(html.text, 'lxml')
      
      for result in soup.select('.tF2Cxc'):
        title = result.select_one('.DKV0Md').text
        print(title)
      
      # Output:
      '''
      How to Setup a Minecraft: Java Edition Server – Home
      Download the Minecraft: Java Edition server
      Setting Up Your Own Minecraft Server - iD Tech
      Tutorials/Setting up a server - Minecraft Wiki
      How to make a Minecraft server on Windows, Mac, or Linux
      How To Make a Minecraft Server - The Ultimate 2021 Guide
      How To Make a Minecraft Server - The Complete Guide - Apex ...
      How to Setup a Minecraft Server on Windows 10 - ServerMania
      How to Create Your Own Minecraft Gaming Server | OVHcloud
      '''
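
One caveat with the loop above: select_one() returns None when nothing matches, so calling .text on it raises AttributeError if a result block has no title element. The sketch below guards against that. The HTML is a hand-written stand-in that reuses the class names from the code above; those class names are Google's at the time of the answer and may have changed since.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in markup (assumption); the middle block has no title.
html = """
<div class="tF2Cxc"><h3 class="DKV0Md">Title one</h3></div>
<div class="tF2Cxc"><span>no title here</span></div>
<div class="tF2Cxc"><h3 class="DKV0Md">Title two</h3></div>
"""

soup = BeautifulSoup(html, "html.parser")

titles = []
for result in soup.select(".tF2Cxc"):
    title_tag = result.select_one(".DKV0Md")
    if title_tag is None:
        # Skip result blocks that lack the expected title element.
        continue
    titles.append(title_tag.get_text(strip=True))

print(titles)
```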
      

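      Alternatively, you can achieve the same thing with the Google Organic Results API from SerpApi. It is a paid API with a free plan.

      The main difference is that you don't have to deal with the problems that come up along the way. You only need to iterate over structured JSON and pick out the data you want, instead of building everything from scratch and maintaining the parser over time.

      Code to integrate: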

      import os
      from serpapi import GoogleSearch
      
      params = {
          "engine": "google",
          "q": "how to create minecraft server",
          "hl": "en",
          "api_key": os.getenv("API_KEY"),
      }
      
      search = GoogleSearch(params)
      results = search.get_dict()
      
      for result in results["organic_results"]:
        title = result['title']
        print(title)
      
      # Output:
      '''
      How to Setup a Minecraft: Java Edition Server – Home
      Download the Minecraft: Java Edition server
      Setting Up Your Own Minecraft Server - iD Tech
      Tutorials/Setting up a server - Minecraft Wiki
      How to make a Minecraft server on Windows, Mac, or Linux
      How To Make a Minecraft Server - The Ultimate 2021 Guide
      How To Make a Minecraft Server - The Complete Guide - Apex ...
      How to Setup a Minecraft Server on Windows 10 - ServerMania
      How to Create Your Own Minecraft Gaming Server | OVHcloud
      '''
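
If a response lacks the "organic_results" key, or an entry lacks a title, the loop above raises KeyError. The sketch below reads the same structure defensively. The dictionary is a hand-written stand-in for what search.get_dict() returns, and extract_titles is a hypothetical helper, not part of the serpapi package.

```python
# Hand-written stand-in for the parsed JSON (assumption, not real API output).
results = {
    "organic_results": [
        {"title": "Example title one"},
        {"position": 2},  # entry without a title
        {"title": "Example title two"},
    ]
}


def extract_titles(results):
    """Collect titles, tolerating a missing key or missing title fields."""
    titles = []
    for result in results.get("organic_results", []):
        title = result.get("title")
        if title:
            titles.append(title)
    return titles


print(extract_titles(results))  # ['Example title one', 'Example title two']
print(extract_titles({}))       # []
```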
      

      P.S. I have a dedicated blog about web scraping.

      Disclaimer: I work for SerpApi.

      【Comments】:
