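【Posted】: 2021-09-24 14:34:30
【Question】:
Andrej helped me write this code, but now I would like to know how to navigate to each page and download all PDFs whose name contains the text/title "Public Comment".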
import requests
from bs4 import BeautifulSoup

url = "https://www.ci.atherton.ca.us/Archive.aspx?AMID=41"
key = "Archive.aspx?ADID="  # substring that identifies links to individual archive entries

soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Print the absolute URL of every archive entry linked from the index page.
for link in soup.find_all("a"):
    if key in link.get("href", ""):
        print("https://www.ci.atherton.ca.us/" + link.get("href"))
Prints:
https://www.ci.atherton.ca.us/Archive.aspx?ADID=3581
https://www.ci.atherton.ca.us/Archive.aspx?ADID=3570
https://www.ci.atherton.ca.us/Archive.aspx?ADID=3564
https://www.ci.atherton.ca.us/Archive.aspx?ADID=3559
https://www.ci.atherton.ca.us/Archive.aspx?ADID=3556
https://www.ci.atherton.ca.us/Archive.aspx?ADID=3554
https://www.ci.atherton.ca.us/Archive.aspx?ADID=3552
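One possible way to extend the script is sketched below. It rests on assumptions that the question does not confirm: that each `Archive.aspx?ADID=...` page is an HTML page, that the PDFs are linked from it with visible link text containing "Public Comment", and that those links point directly at the PDF files. The helper names (`archive_links`, `matching_links`, `safe_filename`, `download_all`) and the file naming scheme are hypothetical and chosen only for illustration.

```python
import re
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

BASE = "https://www.ci.atherton.ca.us/"
INDEX_URL = BASE + "Archive.aspx?AMID=41"
KEY = "Archive.aspx?ADID="
PHRASE = "Public Comment"  # matched case-insensitively against the link text


def archive_links(html, base=BASE, key=KEY):
    """Return absolute URLs of all archive entries linked from the index HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        urljoin(base, a["href"])
        for a in soup.find_all("a", href=True)
        if key in a["href"]
    ]


def matching_links(html, page_url, phrase=PHRASE):
    """Return (title, absolute_url) pairs for links whose text contains phrase."""
    soup = BeautifulSoup(html, "html.parser")
    found = []
    for a in soup.find_all("a", href=True):
        title = a.get_text(" ", strip=True)
        if phrase.lower() in title.lower():
            found.append((title, urljoin(page_url, a["href"])))
    return found


def safe_filename(title):
    """Turn a link title into a file name (hypothetical naming scheme)."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", title).strip("_") + ".pdf"


def download_all(index_url=INDEX_URL):
    """Visit every archive page and save each matching PDF to the current directory."""
    index_html = requests.get(index_url, timeout=30).text
    for page_url in archive_links(index_html):
        page_html = requests.get(page_url, timeout=30).text
        for title, pdf_url in matching_links(page_html, page_url):
            resp = requests.get(pdf_url, timeout=60)
            resp.raise_for_status()
            # Assumption: the link serves the PDF directly; skip anything else.
            if "pdf" not in resp.headers.get("Content-Type", "").lower():
                continue
            with open(safe_filename(title), "wb") as fh:
                fh.write(resp.content)


if __name__ == "__main__":
    download_all()
```

If the archive pages turn out to serve the PDF itself rather than an HTML page, the phrase would have to be matched against the link text on the index page instead, for example by calling `matching_links` on the index HTML. Note that two links with the same title would overwrite each other under this naming scheme.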
【Comments】:
-
Where is the string "Public Comments" supposed to appear? Can you give an example URL?
-
Tags: python html web-scraping beautifulsoup hyperlink