【Question title】: Scraping data from Yahoo Finance
【Posted】: 2021-03-24 18:42:51
【Question】:

I have been trying to scrape data from Yahoo Finance, but every attempt fails with the following error:

Traceback (most recent call last):   
  File "C:\Users\nnarn\PycharmProjects\papaproject\main.py", line 15, in <module>
    print(str(parsePrice()))   
  File "C:\Users\nnarn\PycharmProjects\papaproject\main.py", line 8, in parsePrice
    soup=bs4.BeautifulSoup(r.text, "xml")   
  File "C:\Users\nnarn\AppData\Local\Programs\Python\Python39\lib\site-packages\bs4\__init__.py", line 243, in __init__
    raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: xml. Do you need to install a parser library?

Code:

import bs4
import requests
from bs4 import BeautifulSoup
    
def parsePrice():
   r=requests.get('https://finance.yahoo.com/quote/FB?p=FB')
   soup=bs4.BeautifulSoup(r.text, "xml")
   price=soup.find('div',{'class':'D(ib) Mend(20px)'})[0].find('span').text
   print(price)
   return price
    
while True:
   print(str(parsePrice()))

【Comments】:

    Tags: python


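    【Solution 1】:

    The BS4 documentation recommends soup = BeautifulSoup(r.text, 'html.parser'), because the page you are downloading is HTML, not XML. The "xml" feature is only provided by lxml, so bs4 raises FeatureNotFound when that library is not installed.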
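To make the parser point concrete, here is a minimal sketch that checks which parsers bs4 can find and then parses a small page with 'html.parser'. The helper parser_available and the SAMPLE_HTML string are hypothetical, introduced only for illustration; SAMPLE_HTML stands in for r.text so that nothing is fetched over the network.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def parser_available(name):
    """Return True if bs4 can build a tree with the named parser."""
    try:
        BeautifulSoup("", name)
        return True
    except FeatureNotFound:
        return False


# Hypothetical stand-in for r.text, so the sketch runs without a network call.
SAMPLE_HTML = "<html><body><div class='quote'><span>123.45</span></div></body></html>"

# 'html.parser' ships with Python, so this does not raise FeatureNotFound.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
span_text = soup.find("span").text

if __name__ == "__main__":
    print("html.parser available:", parser_available("html.parser"))
    # The "xml" feature is only available when lxml is installed.
    print("xml available:", parser_available("xml"))
    print(span_text)
```

If parser_available("xml") reports False on your machine, that matches the traceback in the question: the requested tree builder is simply not installed.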

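    【Comments】:

      【Solution 2】:

      Just drop "xml" (or replace it with "html.parser"). Your soup.find call also contains an error: find returns a single tag, so it cannot be indexed with [0] like a list. Call find_all on the spans first, then pick the one you need from the list: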

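      import time

      import requests
      from bs4 import BeautifulSoup

      def parsePrice():
          r = requests.get('https://finance.yahoo.com/quote/FB?p=FB')
          # Name the parser explicitly; omitting it makes bs4 guess and emit a warning.
          soup = BeautifulSoup(r.text, 'html.parser')
          # find() returns one tag; find_all() returns a list that can be indexed.
          price = soup.find('div', {'class': 'D(ib) Mend(20px)'}).find_all('span')[0].text
          return price

      while True:
          print(parsePrice())
          time.sleep(5)  # pause between requests; the interval is an arbitrary choice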
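The difference between find and find_all that this answer relies on can be tried offline. The sketch below is an illustration under stated assumptions: SAMPLE_HTML is made-up markup that only mimics the class name used in the answer (the real Yahoo Finance page may be structured differently), and extract_price is a hypothetical helper that returns None instead of raising when the container is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical markup and values that mimic the structure the answer relies on;
# the real Yahoo Finance page may differ.
SAMPLE_HTML = """
<div class="D(ib) Mend(20px)">
  <span>278.74</span>
  <span>+1.23 (+0.44%)</span>
</div>
"""


def extract_price(html):
    """Return the text of the first <span> in the quote container, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns a single Tag, or None when nothing matches.
    container = soup.find("div", {"class": "D(ib) Mend(20px)"})
    if container is None:
        return None
    # find_all() returns a list-like ResultSet, which is safe to index.
    spans = container.find_all("span")
    return spans[0].text if spans else None


if __name__ == "__main__":
    print(extract_price(SAMPLE_HTML))
    print(extract_price("<p>no quote here</p>"))
```

Checking for None before indexing matters in practice, because a scraped page can change its class names at any time and find would then return nothing.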
      

      【Comments】:
