Bypassing Loop AttributeError: 'NoneType' object has no attribute 'findAll' (answers)

【Question title】: Bypassing Loop AttributeError: 'NoneType' object has no attribute 'findAll'
【Posted】: 2015-06-07 08:02:13
【Question】:

import requests
from bs4 import BeautifulSoup
import csv
from urlparse import urljoin
import urllib2


base_url = 'http://www.baseball-reference.com'
data = requests.get("http://www.baseball-reference.com/players/")
soup = BeautifulSoup(data.content)
player_url = 'http://www.baseball-reference.com/players/'
game_logs = 'http://www.baseball-reference.com/players/gl.cgi?id='
years = ['2000','2001','2002','2003','2004','2005','2006','2007','2008','2009','2010','2011','2012','2013','2014','2015']
url = []
for link in soup.find_all('a'):
    if link.has_attr('href'):
        url.append(base_url + link['href'])
sink = []
for l in url:
    if l[0:42] in player_url:
        sink.append(l)
abc = []
for aa in sink:
    if len(aa) > 48:
        abc.append(aa)
urlz = []
for ab in abc:
    data = requests.get(ab)
    soup = BeautifulSoup(data.content)
    for link in soup.find_all('a'):
        if link.has_attr('href'):
            urlz.append(base_url + link['href'])
abc = []
for aa in urlz:
    if game_logs in aa:
        abc.append(aa)
urlll = []
for ab in years:
    for ac in abc:
        if ab in ac:
            urlll.append(ac)

for j in urlll:
    response = requests.get(j)
    html = response.content
    soup = BeautifulSoup(html)
    table = soup.find('table', attrs={'id': 'batting_gamelogs'})
    list_of_rows = []
    for row in table.findAll('tr'):
        list_of_cells = []
        for cell in row.findAll('td'):
            text = cell.text.replace('&nbsp;', '').encode("utf-8")
            list_of_cells.append(text)
        list_of_rows.append(list_of_cells)
    print list_of_rows

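When I loop through the URLs to fetch the tables, some of the URLs point to pages where the table does not exist. I get an error that looks like this: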

Traceback (most recent call last):
  File "py5.py", line 55, in <module>
    list_of_cells.append(text)
AttributeError: 'NoneType' object has no attribute 'findAll'

Is there a way to keep the loop going when a page has no table?

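【Question comments】:

Use try and except.
Use exception handling.
if whatever is None: continue?
Damn, the moment I finished writing my answer, I saw that everyone had already commented the same idea.
I don't understand how this line, list_of_cells.append(text), can raise an error about the attribute findAll.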
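
The `if whatever is None: continue` idea from the comments above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual script: the inline `PAGES` strings and the helper name `extract_rows` are hypothetical stand-ins for the downloaded game-log pages, and the code is written for Python 3 with bs4's built-in `html.parser`. It relies on the fact that `soup.find(...)` returns `None` when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical pages standing in for the downloaded game-log HTML:
# the first contains the table, the second does not.
PAGES = [
    '<table id="batting_gamelogs"><tr><td>Apr 3</td><td>2</td></tr></table>',
    '<p>No game logs for this season.</p>',
]


def extract_rows(html):
    """Return the table rows as lists of cell text, or None if the table is absent."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"id": "batting_gamelogs"})
    if table is None:  # find() returns None when nothing matches
        return None
    rows = []
    for row in table.find_all("tr"):
        rows.append([cell.get_text(strip=True) for cell in row.find_all("td")])
    return rows


all_rows = []
for html in PAGES:
    rows = extract_rows(html)
    if rows is None:
        continue  # no table on this page: skip it and keep looping
    all_rows.append(rows)

print(all_rows)
```

In the original loop the same check would sit right after the `soup.find('table', ...)` line, before `table.findAll('tr')` is called.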

Tags: python loops error-handling web-scraping

【Solution 1】:

Use try and except and handle the error:

for row in table.findAll('tr'):
    list_of_cells = []
    for cell in row.findAll('td'):
        text = cell.text.replace('&nbsp;', '').encode("utf-8")
        try:
            list_of_cells.append(text)
        except Exception as e:
            # handle the exception here
            pass
    list_of_rows.append(list_of_cells)

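【Comments】:

That didn't work. When the loop reaches a URL where the attribute is missing, the error seems to terminate the loop before it ever gets to the try/except.
Can you provide an example of a page that fails?
baseball-reference.com/players/… is one page that fails. As I loop through the URLs, I want the loop to move past the error when it occurs. How can I do that?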
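
One likely reason the `try` in the answer is never reached: when a page has no matching table, `soup.find(...)` returns `None`, so the `AttributeError` is raised by `table.findAll('tr')` itself, before the inner loop starts. A minimal sketch of moving the `try` so that it wraps that call is shown here. The inline `pages` dictionary and the `skipped` list are hypothetical additions used so the example runs offline instead of calling `requests.get(...)`. It is written for Python 3 and uses `find_all`, the current bs4 spelling of `findAll`.

```python
from bs4 import BeautifulSoup

# Hypothetical inline pages used in place of requests.get(...);
# the second one has no batting_gamelogs table.
pages = {
    "page-with-table": '<table id="batting_gamelogs"><tr><td>1</td><td>HR</td></tr></table>',
    "page-without-table": "<html><body>Nothing here</body></html>",
}

results = {}
skipped = []
for name, html in pages.items():
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"id": "batting_gamelogs"})
    try:
        # This is the call that raises when table is None, so the try
        # starts here rather than around list_of_cells.append().
        table_rows = table.find_all("tr")
    except AttributeError:
        skipped.append(name)  # remember which page had no table
        continue              # move on to the next page
    list_of_rows = []
    for row in table_rows:
        list_of_cells = [cell.get_text() for cell in row.find_all("td")]
        list_of_rows.append(list_of_cells)
    results[name] = list_of_rows

print(results)
print(skipped)
```

Catching only `AttributeError` around a single call keeps unrelated failures visible. An explicit `if table is None: continue` check achieves the same effect without exception handling.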