HTTP Error 508: Loop Detected (Python urllib.request)

【Question title】: HTTP Error 508: Loop Detected python urllib.request
【Posted】: 2020-01-03 12:15:19
【Question】:

I am scraping a website with the code below. It runs fine twice, and on the third run it fails with this error:

HTTP Error 508: Loop Detected

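from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

# `url` is the start page to scrape (defined elsewhere)
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
soup = BeautifulSoup(webpage, 'html.parser')

# Follow every link inside the "columns-list" block
liList = soup.find('div', attrs={'class': 'columns-list'})
links = []
for a in liList.find_all('a'):

    req = Request(a.attrs['href'], headers={'User-Agent': 'Mozilla/5.0'})
    webpage = urlopen(req).read()
    data = BeautifulSoup(webpage, 'html.parser')
    h = data.find("div", attrs={'class': 'first-h2'})

    print(h.h2.text)
    print(data.find("h5"))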

How can I prevent this? Sometimes the code works, and sometimes it raises this error.

【Comments on the question】:

Tags: python web-scraping web-crawler urllib

【Solution 1】:

My guess is that this is a variant of "Internal Server Error": it indicates that the server got stuck in a loop, as described here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/508

It indicates that the server terminated an operation because it encountered
an infinite loop while processing a request with "Depth: infinity". This
status indicates that the entire operation failed.

So this is a server-side error, not a problem in your code.
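
Because the question says the error only shows up some of the time, one client-side mitigation is to pause and retry when the server answers with a 5xx status. The following is only a minimal sketch under the assumption that the 508 is transient; `fetch_with_retry`, its `retries` and `delay` parameters, and the `opener` argument are hypothetical names introduced here for illustration and are not part of urllib.

```python
import time
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_with_retry(url, retries=3, delay=1.0, opener=urlopen):
    """Return the response body, retrying on 5xx responses such as 508.

    Hypothetical helper: `opener` defaults to urllib's urlopen and can be
    swapped for a stub when testing without network access.
    """
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    for attempt in range(retries + 1):
        try:
            return opener(req).read()
        except HTTPError as err:
            # Client errors (4xx) and the last failed attempt are re-raised;
            # only server-side errors are retried.
            if err.code < 500 or attempt == retries:
                raise
            # Exponential backoff: delay, 2*delay, 4*delay, ...
            time.sleep(delay * 2 ** attempt)
```

In the loop from the question, `urlopen(req).read()` could then be replaced by `fetch_with_retry(a.attrs['href'])`. If the server keeps returning 508 after several attempts, retrying will not help and the problem has to be fixed on the server.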

【Comments】: