【Posted】: 2021-09-12 09:53:39
【Question】:
When I access this link, https://www.dickssportinggoods.com/f/tents-accessories?pageNumber=2, with requests_html, I need to wait for some time before the page has actually loaded. Is that possible? My code:
from requests_html import HTMLSession
from bs4 import BeautifulSoup
from lxml import etree
s = HTMLSession()
response = s.get(
'https://www.dickssportinggoods.com/f/tents-accessories?pageNumber=2')
response.html.render()
soup = BeautifulSoup(response.content, "html.parser")
dom = etree.HTML(str(soup))
item = dom.xpath('//a[@class="rs_product_description d-block"]/text()')[0]
print(item)
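One possible approach, offered as a minimal sketch rather than a confirmed fix: render() accepts sleep and timeout keyword arguments, so the pause can be requested from the renderer itself. The helper name fetch_first_product_name and the values of 5 and 30 seconds are assumptions made for illustration; the delay this particular site needs is not known. Note also that response.content holds the bytes as originally downloaded, so the sketch queries response.html, which is what render() updates.

```python
def fetch_first_product_name(url, sleep_seconds=5, timeout=30):
    """Render the page with a pause, then return the first matching link text (or None).

    The function name and both default values are illustrative assumptions.
    """
    # Imported lazily so that defining this helper does not require the package.
    from requests_html import HTMLSession

    session = HTMLSession()
    try:
        response = session.get(url)
        # sleep: seconds to pause after the initial render, giving scripts time to run.
        # timeout: seconds before the render gives up.
        response.html.render(sleep=sleep_seconds, timeout=timeout)
        # Query the rendered DOM; response.content still holds the unrendered bytes.
        return response.html.xpath(
            '//a[@class="rs_product_description d-block"]/text()', first=True
        )
    finally:
        # Shut down the session (and the headless browser, if one was started).
        session.close()
```

Calling fetch_first_product_name('https://www.dickssportinggoods.com/f/tents-accessories?pageNumber=2') would then print-ready return the first product name, or None if nothing matched. If it returns None, a longer sleep may help, though the site may also be blocking headless browsers, which no delay would fix.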
【Comments】:
-
That answer says to use "r.html.render()", which I am already doing.
-
@Ibstam Ch pip install requests-html, then: from requests_html import HTMLSession; from requests_html import AsyncHTMLSession
-
I think you have not added requests-html.
Tags: python web-scraping python-requests-html