【Posted】: 2021-09-30 13:37:53
【Question】:
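For my computing project, I am trying to build a financial forecasting website. One component of the code is a web-scraping API that pulls data from a company's income statement on Yahoo Finance.
However, I keep getting a 404 error even though the URL is correct.
My code: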
import pandas as pd
import urllib.request as ur
from bs4 import BeautifulSoup
import warnings
import ssl

# Disable HTTPS certificate verification for urllib
ssl._create_default_https_context = ssl._create_unverified_context

income_url = 'http://uk.finance.yahoo.com/quote/AAPL/financials?p=AAPL'
read_url = ur.urlopen(income_url).read()  # this call raises the HTTPError
income_soup = BeautifulSoup(read_url, 'lxml')

# Collect the text and the title attribute of every <div> on the page
div_list = []
for div in income_soup.find_all('div'):
    div_list.append(div.string)
    if not div.string == div.get('title'):
        div_list.append(div.get('title'))

# Drop section labels, empty entries and inline script text
div_list = [incl for incl in div_list if incl not in
            ('Operating Expenses', 'Non-recurring Events', 'Expand All')]
div_list = list(filter(None, div_list))
div_list = [incl for incl in div_list if not incl.startswith('(function')]

# Keep only the income-statement cells and group them into rows of 6
income_list = div_list[13: -5]
income_list.insert(0, 'Breakdown')
income_data = list(zip(*[iter(income_list)]*6))

# Use the first row as column headers and 'Breakdown' as the index
income_df = pd.DataFrame(income_data)
headers = income_df.iloc[0]
income_df = income_df[1:]
income_df.columns = headers
income_df.set_index('Breakdown', inplace=True, drop=True)

warnings.warn('Amounts are in thousands.')
print(income_df)
I keep getting this error:
urllib.error.HTTPError: HTTP Error 404: Not Found
How can I fix it?
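One commonly reported cause of this symptom is that the site rejects requests carrying urllib's default `Python-urllib/x.y` User-Agent, so the same URL can work in a browser and fail in a script. The sketch below sends a browser-like header instead. The helper names `build_request` and `fetch_html` and the header value are assumptions made for illustration. They are not part of the original code, and there is no guarantee that Yahoo Finance will accept them.

```python
import urllib.request as ur

# Assumption: a generic browser-like User-Agent string; the exact value is illustrative.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0 Safari/537.36"
    ),
}


def build_request(url, headers=None):
    """Return a urllib Request carrying browser-like headers (no network I/O)."""
    return ur.Request(url, headers=headers or BROWSER_HEADERS)


def fetch_html(url, timeout=10):
    """Download the page body as bytes; raises urllib.error.HTTPError on 4xx/5xx."""
    with ur.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

With these helpers, `read_url = ur.urlopen(income_url).read()` would become `read_url = fetch_html(income_url)`. The parsing code that follows depends on the page layout, which may have changed since the question was posted.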
【Comments】:
- @AndyKnight How would I change the redirect to get the code running?
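On the redirect question: `urllib` follows redirects automatically, so it can help to see which URL actually produced the 404 before changing anything. The sketch below is only a diagnostic aid. `describe_http_error` and `try_fetch` are hypothetical helper names, and the sketch assumes the `url` attribute of the raised `HTTPError` holds the address of the final request after any redirects.

```python
import urllib.error
import urllib.request as ur


def describe_http_error(err):
    """Summarise an HTTPError as 'status reason for url'."""
    return f"{err.code} {err.reason} for {err.url}"


def try_fetch(url, timeout=10):
    """Return (body, None) on success or (None, description) on an HTTP error."""
    try:
        with ur.urlopen(url, timeout=timeout) as response:
            return response.read(), None
    except urllib.error.HTTPError as err:
        return None, describe_http_error(err)
```

Calling `try_fetch(income_url)` and printing the second element shows whether the failing address differs from the one requested, for example because of a switch from `http` to `https` or to another host.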
Tags: python web-scraping http-status-code-404 yahoo-finance