【Posted】: 2021-04-08 16:24:27
【Question】:
I'm trying to scrape the href of every ad listed on IMMOWEB under this link. The ad URLs are loaded by JavaScript. I'm using HTMLSession, but I can't get the result I need. Here is my code:
from bs4 import BeautifulSoup
from requests_html import HTMLSession

url = 'https://www.immoweb.be/en/search/apartment/for-sale?countries=BE&isNewlyBuilt=false&maxBedroomCount=3&maxPrice=200000&maxSurface=130&minBedroomCount=1&minPrice=100000&minSurface=65&postalCodes=2000,2018,2060,2140,2170,2600,2610,2627,2640,2650,2660,2845,2850,2900,2980&page=1&orderBy=newest&card=9267356'
sessions = HTMLSession()
r = sessions.get(url)
r.html.render()  # runs the page's JavaScript in headless Chromium
# Note: r.content is the raw response body, fetched before rendering
soup = BeautifulSoup(r.content, "html.parser")
print(soup)
Desired output:
https://www.immoweb.be/en/classified/apartment/for-sale/antwerpen-merksem/2170/9268787?searchId=606f2c6d4c669
https://www.immoweb.be/en/classified/apartment/for-sale/merksem/2170/9268390?searchId=606f2c6d4c669
'And other hrefs'
【Discussion】:
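One likely cause is that `r.content` holds the response as first downloaded, while the DOM produced by `r.html.render()` is exposed through `r.html.html`. The sketch below is one possible approach, not a verified fix: it has not been run against the live site, which may load results late or block headless browsers. `ClassifiedLinkParser`, `extract_classified_links` and `fetch_rendered_html` are hypothetical helper names, and the `/classified/` filter is an assumption taken from the desired output URLs in the question. The link extraction uses only the standard library so it can be checked without a network connection.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ClassifiedLinkParser(HTMLParser):
    """Collects absolute hrefs of <a> tags whose href contains a marker."""

    def __init__(self, base_url, marker="/classified/"):
        super().__init__()
        self.base_url = base_url
        self.marker = marker  # assumption: every ad URL contains this segment
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href:
            absolute = urljoin(self.base_url, href)
            if absolute not in self.links:  # drop duplicates, keep page order
                self.links.append(absolute)


def extract_classified_links(html, base_url="https://www.immoweb.be"):
    """Return the ad links found in an HTML string, in order of appearance."""
    parser = ClassifiedLinkParser(base_url)
    parser.feed(html)
    return parser.links


def fetch_rendered_html(url, sleep=2):
    """Fetch a page and return its HTML after the JavaScript has run."""
    # Third-party dependency, imported lazily: pip install requests-html
    from requests_html import HTMLSession

    session = HTMLSession()
    response = session.get(url)
    response.html.render(sleep=sleep)  # wait a little for the ads to load
    return response.html.html  # rendered DOM, unlike response.content
```

To use it with the code from the question, call `extract_classified_links(fetch_rendered_html(url))` and print each item. If you would rather keep BeautifulSoup, passing `r.html.html` instead of `r.content` to it should have the same effect. requests_html also offers `r.html.absolute_links`, which returns every absolute link on the rendered page as a set that you can filter the same way.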
Tags: python python-3.x beautifulsoup python-requests