How to catch 'NoneType' object has no attribute 'get' in a comprehensive list

【Question title】: How to catch 'NoneType' object has no attribute 'get' in a comprehensive list
【Posted】: 2019-07-02 11:20:54
【Question description】:

I want to scrape URLs from a website. I am using beautifulsoup4.

The structure I am trying to scrape looks like this: HTML Structure

The code I am using is:

soup = BeautifulSoup(response.text, "html.parser")
all_urls = [x.p.a.get('href') for x in soup.findAll("div", class_="b-accordion__text")]

When I run the script, I get the following error:

'NoneType' object has no attribute 'get'

This is probably because some divs are empty and contain no p/a, so get ends up being called on an object that does not exist:

 <div class="b-accordion__text">
</div>
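
To make the cause concrete, here is a minimal reproduction sketch. The markup and the example.com URL are made up for illustration and are not the real page. It shows that accessing .p on an empty div gives None rather than raising, so the failure only appears one step later in the chain.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one div with a link, one empty div.
html = """
<div class="b-accordion__text"><p><a href="https://example.com/a">A</a></p></div>
<div class="b-accordion__text">
</div>
"""

soup = BeautifulSoup(html, "html.parser")
divs = soup.find_all("div", class_="b-accordion__text")

# div.p looks up the first <p> inside the div and yields None when there is none.
first_children = [div.p for div in divs]
print([child is None for child in first_children])  # [False, True]
```

For the second div, x.p is None, so x.p.a fails with "'NoneType' object has no attribute 'a'" no matter where in the comprehension it is written.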

When I try to add an if condition:

all_urls = [x.p.a.get('href') for x in soup.findAll("div", class_="b-accordion__text") if x.p.a]

I get a different error instead:

'NoneType' object has no attribute 'a'

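Since I am a complete beginner in Python, I don't know how to handle this error. I expected to get a warning that some elements have no p/a while the script kept running. Instead, it aborted.

Question: how do I handle/catch the error for empty div tags?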

【Comments】:

all_urls = [x.p.a.get('href') for x in soup.findAll("div", class_="b-accordion__text") if x.p.a]. I'm no Python expert, but this should help, i.e. if x.p.a
@shahkalpesh Unfortunately that did not work. I have added this to the original post.
Oops, it should be if x.p

Tags: python python-3.x beautifulsoup

【Solution 1】:

I haven't tested the code, but you can add a condition to the list comprehension like this:

soup = BeautifulSoup(response.text, "html.parser")
all_urls = [x.p.a.get('href') for x in soup.findAll("div", class_="b-accordion__text") if not x.p.a is None]

More generally, to test for a specific attribute you can use the built-in hasattr function.
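
As a rough illustration of what hasattr reports, here is a small sketch that uses plain Python objects rather than BeautifulSoup tags (the variable names are hypothetical). Whether hasattr is a useful test on a tag itself is a separate matter and is not checked here.

```python
# Plain Python objects, used only to show how hasattr behaves.
missing = None
mapping = {"href": "https://example.com"}

print(hasattr(missing, "get"))  # False: None has no .get method
print(hasattr(mapping, "get"))  # True: dict has a .get method

# Guard the call so .get is only used on objects that provide it.
value = mapping.get("href") if hasattr(mapping, "get") else None
print(value)
```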

【Comments】:

Same as @shahkalpesh's comment on the original post. It still doesn't work.

【Solution 2】:

Adding a double if condition to the list comprehension, checking that both the "p" and the "a" exist, solved the problem:

all_urls = [x.p.a.get('href') for x in soup.findAll("div", class_="b-accordion__text") if x.p and x.p.a]
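
Below is a self-contained sketch of the same idea on made-up markup (the HTML, the example.com URL, and the helper name href_or_none are assumptions for illustration). It also shows one possible way to literally "catch" the error with try/except, which the question asked about.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a div with a link, an empty div, and a div whose <p> has no link.
html = """
<div class="b-accordion__text"><p><a href="https://example.com/a">A</a></p></div>
<div class="b-accordion__text">
</div>
<div class="b-accordion__text"><p>no link here</p></div>
"""

soup = BeautifulSoup(html, "html.parser")
divs = soup.find_all("div", class_="b-accordion__text")

# `and` short-circuits: x.p.a is only evaluated when x.p is not None.
all_urls = [x.p.a.get("href") for x in divs if x.p and x.p.a]
print(all_urls)  # ['https://example.com/a']


def href_or_none(div):
    """Hypothetical helper: return the first link's href, or None if the chain breaks."""
    try:
        return div.p.a.get("href")
    except AttributeError:
        # Raised when div.p or div.p.a is None.
        return None


caught = [href for href in map(href_or_none, divs) if href is not None]
print(caught)  # ['https://example.com/a']
```

The order of the two checks matters: writing only if x.p.a still evaluates x.p.a on the empty div, which is why the earlier attempts failed.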

【Comments】: