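【Posted】: 2021-06-14 05:27:16
【Question】:
I am running into an IndexError while iterating. The program runs fine until everything is done and there are no more "sub-pages" left to visit; then it crashes, so the results cannot be saved to a .txt file.
Traceback (most recent call last):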
newUrl = nextpage[counter]['href']
IndexError: list index out of range
Code:
from urllib.request import urlopen, Request
from bs4 import BeautifulSoup
import json

class Olx():
    def __init__(self, url):
        self.url = url

    def getPrice(self):
        """Get prices from olx"""
        html = urlopen(self.url)
        bs = BeautifulSoup(html, 'html.parser')
        price = bs.findAll('p', class_='price')
        return price

    def nextPage(self):
        """Go to the next page"""
        html = urlopen(self.url)
        bs = BeautifulSoup(html, 'html.parser')
        pageButton = bs.findAll('a', {'class': 'block br3 brc8 large tdnone lheight24'})
        try:
            return pageButton
        except AttributeError:
            None
        else:
            return pageButton

olxprices = Olx('https://www.olx.pl/nieruchomosci/mieszkania/wynajem/olsztyn/').getPrice()
nextpage = Olx('https://www.olx.pl/nieruchomosci/mieszkania/wynajem/olsztyn/').nextPage()
counter = 0
output = []
while len(nextpage) > 0:
    for price in olxprices:
        output.append(price.get_text().strip())
        print(price.get_text().strip())
    newUrl = nextpage[counter]['href']
    olxprices = Olx(newUrl).getPrice()
    counter += 1
print(output)
【Discussion】:
Tags: python loops beautifulsoup iteration index-error
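A likely cause, judging from the posted code: `len(nextpage)` never changes, so the `while` condition stays true while `counter` keeps growing, and `nextpage[counter]` eventually runs past the end of the list. One possible fix is to iterate over the links directly instead of indexing with a counter. The sketch below shows only that loop structure. `collect_prices` and `fetch_prices` are hypothetical names that are not in the question, and the sample data is made up so that the snippet runs without network access.

```python
def collect_prices(first_page_prices, next_links, fetch_prices):
    """Return the prices of the first page followed by those of every linked page.

    fetch_prices is a hypothetical callable that maps a URL to a list of
    price strings; in the question it would wrap Olx(url).getPrice().
    """
    output = list(first_page_prices)
    # A for loop stops by itself once the links run out, so no index
    # can ever go out of range.
    for url in next_links:
        output.extend(fetch_prices(url))
    return output


# Made-up stand-in for the real site (assumption, for illustration only).
fake_site = {
    "page2": ["1500", "2000"],
    "page3": ["1800"],
}

prices = collect_prices(["1200"], ["page2", "page3"], lambda url: fake_site[url])
print(prices)
```

With the code from the question, the links would presumably be `[a['href'] for a in nextpage]`, and `fetch_prices` would return `[p.get_text().strip() for p in Olx(url).getPrice()]`. Because the loop now ends normally, any code that writes `output` to a .txt file after it will be reached.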