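【Posted】: 2020-04-11 19:30:26
【Question】:
I'm learning to scrape websites with BeautifulSoup and am trying to pull data from Yahoo Finance. As I go, I keep wondering why the code successfully finds what I want when it is not inside a for loop (looking up one specific ticker), but as soon as I have it read a CSV file to search multiple tickers, the .find() method raises an error instead of returning the tag I'm looking for.
This is the code when it works fine: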
```
import requests
import csv
from bs4 import BeautifulSoup

# ------ FOR LOOP THAT MESSES THINGS UP ------
# with open('s&p500_tickers.csv', 'r') as tickers:
#     for ticker in tickers:
ticker = 'AAPL'  # ------ TEMPORARY TICKER TO TEST CODE
web = requests.get(f'https://ca.finance.yahoo.com/quote/{ticker}/financials?p={ticker}').text
soup = BeautifulSoup(web, 'lxml')
section = soup.find('section', class_='smartphone_Px(20px) Mb(30px)')
tbl = section.find('div', class_='M(0) Whs(n) BdEnd Bdc($seperatorColor) D(itb)')
headerRow = tbl.find("div", class_="D(tbr) C($primaryColor)")

# ------ CODE I USED TO VISUALIZE THE RESULT ------
breakdownHead = headerRow.text[0:9]
ttmHead = headerRow.text[9:12]
lastYear = headerRow.text[12:22]
twoYears = headerRow.text[22:32]
threeYears = headerRow.text[32:42]
fourYears = headerRow.text[42:52]
print(breakdownHead, ttmHead, lastYear, twoYears, threeYears, fourYears)
```
It returns this:
```
Breakdown ttm 2019-09-30 2018-09-30 2017-09-30 2016-09-30
Process finished with exit code 0
```
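A side note on the fixed-offset slicing used to produce that output: it only works while every header cell has exactly the expected width. The sketch below shows one possible alternative that pulls the dates out by pattern instead. The `header_text` string is an assumption, reconstructed from the slices and the printed output in the question, and `split_header` is a hypothetical helper name rather than anything from BeautifulSoup.

```python
import re

# Assumed header text: the cell labels run together with no separators,
# reconstructed from the slices and printed output in the question.
header_text = "Breakdownttm2019-09-302018-09-302017-09-302016-09-30"

DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"


def split_header(text):
    """Split the concatenated header into its label prefix and a list of ISO dates."""
    dates = re.findall(DATE_PATTERN, text)
    first_date = re.search(DATE_PATTERN, text)
    # Everything before the first date is treated as the non-date labels.
    labels = text[: first_date.start()] if first_date else text
    return labels, dates


labels, dates = split_header(header_text)
print(labels, dates)
```

With a real page, calling `headerRow.get_text(' ', strip=True)` instead of `headerRow.text` may be simpler still, since it puts a space between the cells so the result can be split directly.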
This is the code that does not work:
```
import requests
import csv
from bs4 import BeautifulSoup

with open('s&p500_tickers.csv', 'r') as tickers:
    for ticker in tickers:
        web = requests.get(f'https://ca.finance.yahoo.com/quote/{ticker}/financials?p={ticker}').text
        soup = BeautifulSoup(web, 'lxml')
        section = soup.find('section', class_='smartphone_Px(20px) Mb(30px)')
        tbl = section.find('div', class_='M(0) Whs(n) BdEnd Bdc($seperatorColor) D(itb)')
        headerRow = tbl.find("div", class_="D(tbr) C($primaryColor)")
        breakdownHead = headerRow.text[0:9]
        ttmHead = headerRow.text[9:12]
        lastYear = headerRow.text[12:22]
        twoYears = headerRow.text[22:32]
        threeYears = headerRow.text[32:42]
        fourYears = headerRow.text[42:52]
        print(breakdownHead, ttmHead, lastYear, twoYears, threeYears, fourYears)
```
Any feedback on my code is welcome, as I'm always trying to get better.
Thanks a lot
【Discussion】:
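One likely cause, offered as a guess rather than a confirmed diagnosis: iterating over a file object yields each line with its trailing newline still attached, so `ticker` would be `'AAPL\n'` rather than `'AAPL'`. That newline ends up in the URL, the page that comes back probably lacks the expected `section`, `soup.find(...)` returns `None`, and the next `.find()` call fails. The sketch below uses an in-memory stand-in for the CSV file, and it assumes the file holds one ticker per row in the first column with no header row. `read_tickers` and `URL_TEMPLATE` are hypothetical names introduced for illustration.

```python
import csv
import io

# Stand-in for open('s&p500_tickers.csv'); the contents are made up for illustration.
fake_file = io.StringIO("AAPL\nMSFT\nGOOG\n")

# Iterating over a file object keeps the trailing newline on every line.
raw_lines = list(fake_file)
print(raw_lines)

URL_TEMPLATE = "https://ca.finance.yahoo.com/quote/{t}/financials?p={t}"


def read_tickers(handle):
    """Return clean ticker symbols from the first column of a CSV file object."""
    tickers = []
    for row in csv.reader(handle):
        # Skip blank rows and strip any stray whitespace around the symbol.
        if row and row[0].strip():
            tickers.append(row[0].strip())
    return tickers


fake_file.seek(0)
tickers = read_tickers(fake_file)
urls = [URL_TEMPLATE.format(t=t) for t in tickers]
print(urls[0])
```

In the original loop, the smallest change along these lines would be `ticker = ticker.strip()` as the first statement inside the `for`. Checking whether `section` is `None` before calling `.find()` on it would also make a failed lookup easier to spot.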
Tags: python web-scraping beautifulsoup