【Posted】:2021-05-17 17:22:53
【Question】:
How can I make a BeautifulSoup scraper faster? This code seems slow; is there any way to speed it up?
import time
import requests
from bs4 import BeautifulSoup, SoupStrainer

def getNews():
    tic = time.perf_counter()
    requests_session = requests.Session()  # created but never used below
    scrapy = requests.get('https://www.marketwatch.com/markets?mod=top_nav').content
    # Built but never passed to BeautifulSoup, so the whole page is still parsed
    product = SoupStrainer('div', {'id': 'collection__elements j-scrollElement'})
    soup = BeautifulSoup(scrapy, 'lxml')
    for container in soup.find_all('div', attrs={'class': 'collection__elements j-scrollElement'}):
        for article in container.find_all('div', attrs={'class': 'article__content'}):
            for headline in article.find_all('h3', attrs={'class': 'article__headline'}):
                for a in headline.find_all('a', href=True):
                    if a.text:
                        print(a.text)
                        print(a['href'])
    toc = time.perf_counter()
    print(toc - tic)
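One thing that stands out in the code above is that the SoupStrainer is built but never handed to BeautifulSoup, so the full document tree is still constructed. A minimal sketch of passing it through `parse_only` might look like the following. The helper name `extract_headlines`, its `parser` parameter, and the sample HTML are hypothetical; the class name comes from the question and may not match the live page.

```python
from bs4 import BeautifulSoup, SoupStrainer

def extract_headlines(html, parser="lxml"):
    """Return (text, href) pairs for links inside <h3 class="article__headline">."""
    # Only the matching <h3> subtrees are built; the rest of the page is skipped.
    only_headlines = SoupStrainer("h3", attrs={"class": "article__headline"})
    soup = BeautifulSoup(html, parser, parse_only=only_headlines)
    pairs = []
    for a in soup.find_all("a", href=True):
        text = a.get_text(strip=True)
        if text:
            pairs.append((text, a["href"]))
    return pairs

# Hypothetical markup that mimics the structure assumed in the question.
sample_html = """
<div class="collection__elements j-scrollElement">
  <div class="article__content">
    <h3 class="article__headline"><a href="https://example.com/a">First headline</a></h3>
  </div>
  <div class="article__content">
    <h3 class="article__headline"><a href="https://example.com/b">Second headline</a></h3>
  </div>
  <p><a href="https://example.com/skip">Not a headline</a></p>
</div>
"""

if __name__ == "__main__":
    for text, href in extract_headlines(sample_html):
        print(text, href)
```

This drops the nested container loops and keeps only the innermost selector, so it could pick up headlines outside the `collection__elements` block if the page has any. Whether it helps noticeably depends on how much of the total time is actually spent parsing.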
【Discussion】:
-
"This code seems slow." But is it? Please define "slow".
-
It takes too long to execute.
-
Define "too long".
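One way to answer the comments above is to time the download and the parsing separately. If nearly all of the time goes to the HTTP request, no parser change will make much difference. The sketch below uses only the standard library; `timed` and `profile_scrape` are hypothetical helper names, and the `fetch` and `parse` callables are supplied by the caller.

```python
import time

def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start

def profile_scrape(fetch, parse, url):
    """Time the fetch and parse steps independently and report both."""
    html, fetch_s = timed(fetch, url)
    items, parse_s = timed(parse, html)
    return {"fetch_s": fetch_s, "parse_s": parse_s, "items": items}
```

For the code in the question, `fetch` could be something like `lambda u: requests.get(u).content` and `parse` a function wrapping the BeautifulSoup loops; comparing `fetch_s` with `parse_s` then shows which step is worth optimizing.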
Tags: python python-3.x beautifulsoup request lxml