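【Posted】: 2019-07-25 20:22:28
【Question】:
This is my first attempt at web scraping. I followed a tutorial, but I am trying to scrape a different page, and I get the following error: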
gamesplayed = data[1].getText()
IndexError: list index out of range
Here is the code so far:
from bs4 import BeautifulSoup
import urllib.request
import csv
urlpage = 'https://www.espn.com/soccer/standings/_/league/FIFA.WORLD/fifa-world-cup'
page = urllib.request.urlopen(urlpage)
soup = BeautifulSoup(page, 'html.parser')
#print(soup)
table = soup.find('table', attrs={'class': 'Table2__table__wrapper'})
results = table.find_all('tr')
#print('Number of results:', len(results))
rows = []
rows.append(['Group A', 'Games Played', 'Wins', 'Draws', 'Losses', 'Goals For', 'Goals Against', 'Goal Difference', 'Points'])
print(rows)
# loop over results
for result in results:
    # find all columns per result
    data = result.find_all('td')
    # check that columns have data
    if len(data) == 0:
        continue
    # write columns to variables
    groupa = data[0].getText()
    gamesplayed = data[1].getText()
    wins = data[2].getText()
    draws = data[3].getText()
    losses = data[4].getText()
    goalsfor = data[5].getText()
    goalsagainst = data[6].getText()
    goaldifference = data[7].getText()
    point = data[8].getText()
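The loop above skips rows with no td cells, but it still indexes data[1] through data[8] on any row that has at least one cell. One possible cause of the IndexError (an assumption, not confirmed by the post) is that some rows contain fewer than nine cells. A minimal sketch of a length guard follows; parse_row, EXPECTED_COLUMNS, and the sample rows are hypothetical names and made-up values used only for illustration.

```python
# Assumption: a complete row has nine cells, matching the nine headers
# appended to `rows` in the question (name column plus eight stat columns).
EXPECTED_COLUMNS = 9


def parse_row(cells, expected=EXPECTED_COLUMNS):
    """Return the first `expected` cell texts, stripped, or None if the row is too short."""
    if len(cells) < expected:
        return None
    return [cell.strip() for cell in cells[:expected]]


# Made-up cell texts standing in for [td.getText() for td in result.find_all('td')].
sample_rows = [
    ['Team X'],  # only one cell: data[1] would raise IndexError on this row
    ['Team Y', '3', '2', '1', '0', '4', '1', '+3', '7'],
]

# Keep only the rows that have every expected column.
parsed = [row for row in map(parse_row, sample_rows) if row is not None]
print(parsed)
```

In the question's loop, the same idea would amount to replacing the `len(data) == 0` check with a check against the number of columns the code goes on to read.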
【Discussion】:
- When you inspect data in a debugger, what does it say it contains?
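One way to answer this without a debugger is to print how many td cells each row holds. The sketch below assumes made-up markup, not ESPN's actual HTML, and the name cell_counts is hypothetical; it only shows the inspection technique.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration: rows with zero, one, and three td cells.
html = """
<table>
  <tr><th>Group A</th></tr>
  <tr><td>Team X</td></tr>
  <tr><td>3</td><td>2</td><td>1</td></tr>
</table>
"""
soup = BeautifulSoup(html, 'html.parser')

cell_counts = []
for index, row in enumerate(soup.find_all('tr')):
    data = row.find_all('td')
    cell_counts.append(len(data))
    # Show the row index, the number of cells, and the text of each cell.
    print(index, len(data), [td.get_text(strip=True) for td in data])
```

Any row whose count is lower than the highest index the loop reads would trigger the IndexError from the question.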
Tags: python python-3.x web web-scraping beautifulsoup