Catching an exception in the BeautifulSoup.findAll function: answers

【Question title】: catch exception in BeautifulSoup.findAll function
【Posted】: 2013-06-06 14:05:56
【Question】:

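I am trying to scrape this afghanistan page by extracting the cities and area codes from the table. Now, when I try to scrape this american-samoa page, findAll() cannot find any <td>, which is correct because that page has none. How do I catch this exception?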

Here is my code:

from bs4 import BeautifulSoup
import urllib2
import re

url = "http://www.howtocallabroad.com/american-samoa"
html_page = urllib2.urlopen(url)
soup = BeautifulSoup(html_page)

areatable = soup.find('table',{'id':'codes'})
d = {}

def chunks(l, n):
    return [l[i:i+n] for i in range(0, len(l), n)]

li = dict(chunks([i.text for i in areatable.findAll('td')], 2))
if li != []:
    print li

    for key in li:
            print key, ":", li[key]
else:
    print "list is empty"

This is the error I get:

Traceback (most recent call last):
  File "extract_table.py", line 15, in <module>
    li = dict(chunks([i.text for i in areatable.findAll('td')], 2))
AttributeError: 'NoneType' object has no attribute 'findAll'

I also tried this, but it did not work either:

def gettdtag(tag):
    return "empty" if areatable.findAll(tag) is None else tag

all_td = gettdtag('td')
print all_td

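【Comments】:

Tags: python beautifulsoup

【Solution 1】:

The error says that areatable is None, meaning soup.find() did not find a table with id "codes" on that page: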

areatable = soup.find('table',{'id':'codes'})
#areatable = soup.find('table', id='codes')  # Also works

if areatable is None:
    print 'Something happened'
    # Exit out
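
The check above can be folded into the whole extraction. What follows is only a minimal sketch, assuming Python 3, the stdlib "html.parser" backend, and inline sample markup in place of the live pages; the helper name extract_codes and the sample cities and codes are hypothetical, not taken from the site.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the two real pages.
PAGE_WITH_TABLE = """
<table id="codes">
  <tr><td>City A</td><td>11</td></tr>
  <tr><td>City B</td><td>22</td></tr>
</table>
"""
PAGE_WITHOUT_TABLE = "<p>No area codes are needed.</p>"


def extract_codes(html):
    """Return {city: area_code}, or an empty dict when the table is missing."""
    soup = BeautifulSoup(html, "html.parser")
    areatable = soup.find("table", id="codes")
    if areatable is None:
        # soup.find() returns None when nothing matches, so stop here
        # instead of calling a method on None.
        return {}
    cells = [td.get_text(strip=True) for td in areatable.find_all("td")]
    # Pair consecutive cells: (city, code), (city, code), ...
    return dict(zip(cells[0::2], cells[1::2]))


print(extract_codes(PAGE_WITH_TABLE))
print(extract_codes(PAGE_WITHOUT_TABLE))
```

Returning an empty dict keeps the caller's "is it empty" test simple; wrapping the call in try/except AttributeError would also work, but it could hide unrelated attribute errors.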

Also, I would use find_all instead of findAll, and get_text() instead of text.
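
A small sketch of those two calls, assuming Python 3 and hypothetical inline markup for a table that exists but has no <td> cells. It also shows that find_all() gives back an empty result rather than None when nothing matches, so a comparison against None (as in the question's second attempt) never triggers.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the table is present but contains no <td> cells.
html = '<table id="codes"><tr><th>City</th><th>Code</th></tr></table>'
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", id="codes")

cells = table.find_all("td")  # empty result, not None
headers = [th.get_text(strip=True) for th in table.find_all("th")]

if not cells:
    print("table has no <td> cells")
print(headers)
```

Testing the result with "if not cells" covers the empty case; the None check is only needed for find(), which returns a single element or None.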

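【Discussion】:

Damn, I wrote almost exactly the same answer.
As for using certain functions instead of others, that would be my fault :P
@Haidro That will be next on my to-do list: learning the difference between find_all and findAll. Learning Python step by step =)
@zipc I think it is just that findAll is from BeautifulSoup version 3 and find_all is from version 4, but findAll still works in 4. Take a look here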
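
The point about the two spellings can be checked with a short sketch, assuming Python 3, an installed bs4, and hypothetical inline markup. Newer bs4 releases may emit a deprecation warning for the old name, so treat this as an illustration rather than a recommendation.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only to compare the two method names.
html = "<table><tr><td>a</td><td>b</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

new_style = soup.find_all("td")
old_style = soup.findAll("td")  # BS3-era spelling kept in bs4 for compatibility

# Compare the extracted text of both result sets.
same = [t.get_text() for t in new_style] == [t.get_text() for t in old_style]
print(same)
```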