BeautifulSoup class find returning None

【Question title】: BeautifulSoup class find returning None
【Posted】: 2019-04-22 11:15:03
【Question】:

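I am writing a Python program that uses BeautifulSoup to retrieve a download link from a website. I am using the find method to locate the HTML class that contains the link, but it returns None.

I have tried reaching this class through its parent class, without success.

Here is my code: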

link = 'https://data.worldbank.org/topic/agriculture-and-rural-development?view=chart'

for link in indicator_links:
    indicator_page = requests.get(link)
    indicator_soup = BeautifulSoup(page.text, 'html.parser')
    download = indicator_soup.find(class_="btn-item download")

Again, I expect the download link to be inside the btn-item download HTML class.

【Discussion】:

What is indicator_links?

Tags: python web-scraping beautifulsoup

【Solution 1】:

Do you mean all the links inside the btn-item download HTML class?

Change your code to this:

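import requests
from bs4 import BeautifulSoup

link = 'https://data.worldbank.org/topic/agriculture-and-rural-development?view=chart'

page = requests.get(link)
indicator_soup = BeautifulSoup(page.text, 'html.parser')
# Find the first element whose class attribute is "btn-item download"
download = indicator_soup.find(class_="btn-item download")
# Print the href of every anchor inside that element
for lnk in download.find_all('a', href=True):
    print(lnk['href'])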
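
A side note on the lookup itself: when the string passed to class_ contains a space, BeautifulSoup compares it against the exact value of the class attribute, so the same classes in a different order, or with an extra class added, will not match and find returns None. The sketch below is one possible way to make the lookup less brittle by using a CSS selector and guarding against a missing element. The function name download_hrefs and the sample markup are made up for illustration; the real page's markup may differ.

```python
from bs4 import BeautifulSoup


def download_hrefs(html):
    """Return the hrefs found in the first element carrying both classes.

    Hypothetical helper: returns an empty list instead of raising when
    no such element exists.
    """
    soup = BeautifulSoup(html, "html.parser")
    # ".btn-item.download" matches both classes regardless of their order.
    container = soup.select_one(".btn-item.download")
    if container is None:
        return []
    hrefs = []
    # The matched element may itself be the anchor.
    if container.name == "a" and container.has_attr("href"):
        hrefs.append(container["href"])
    # Otherwise (or additionally) collect anchors nested inside it.
    hrefs.extend(a["href"] for a in container.find_all("a", href=True))
    return hrefs


# Made-up markup with the two classes in reverse order.
sample = '<div class="download btn-item"><a href="/data.csv">CSV</a></div>'
print(download_hrefs(sample))
```

With this sample, find(class_="btn-item download") would return None because the attribute string is "download btn-item", while the selector still finds the element.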

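【Discussion】:

【Solution 2】:

The problem was that I was creating the BeautifulSoup object from the wrong HTML argument: inside the loop I parsed page.text rather than the indicator_page response I had just fetched. It should be: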

indicator_soup = BeautifulSoup(indicator_page.text, 'html.parser')

instead of

indicator_soup = BeautifulSoup(page.text, 'html.parser')
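
For completeness, here is a rough sketch of the corrected loop as a whole. It is only an illustration under some assumptions: fetch_html and collect_downloads are hypothetical names, urllib.request is swapped in for requests so the fetch step can be replaced easily, and indicator_links is assumed to be a list of URL strings.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its body as text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def collect_downloads(indicator_links, fetch=fetch_html):
    """Return the first "btn-item download" element of each page, or None."""
    downloads = []
    for link in indicator_links:
        indicator_page_text = fetch(link)
        # Parse the text fetched in this iteration, not an unrelated `page`.
        indicator_soup = BeautifulSoup(indicator_page_text, "html.parser")
        downloads.append(indicator_soup.find(class_="btn-item download"))
    return downloads
```

Passing a different fetch callable (for example one that reads saved HTML files) makes it possible to check the parsing without hitting the website.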

【Discussion】:

【Solution 3】:

If you want a link, it will 100% be in an a tag. This is the best help I can offer:

from bs4 import BeautifulSoup
import urllib.request

page_url = "https://data.worldbank.org/topic/agriculture-and-rural-development?view=chart"
soup = BeautifulSoup(urllib.request.urlopen(page_url), 'lxml')

# class_ (with the trailing underscore) is the keyword bs4 uses for the class attribute
what_you_want = soup.find('a', class_="btn-item download")

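This should give you the link you want.

Since I don't know what indicator_links is, I'm not sure what you are trying to do in your code.

【Discussion】:

The syntax at the beginning is invalid.
Where exactly?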