[Posted]: 2018-06-13 19:37:10
[Question]:
I'm having trouble with this code:
import csv

from bs4 import BeautifulSoup
from requests import get

i = range(0, 51)
page_number = 1
with open('hltb data/HLTB.csv', 'w') as f:
    thewriter = csv.writer(f)
    # Header row for the spreadsheet
    thewriter.writerow(['Game Name:', 'Game Length:', 'Game Developer:', "Game Publisher:", 'Game Genre:', 'Game Console:', 'URL:'])

for element in i:
    # Fetch and parse the page for the current game id
    url = 'https://howlongtobeat.com/game.php?id=' + format(page_number)
    response = get(url)
    html_soup = BeautifulSoup(response.text, 'html.parser')
    page_number += 1
    # Each field falls back to a placeholder when the lookup fails
    try:
        game_name = html_soup.select('div.profile_header')[0].text
    except:
        game_name = "Game Name not found"
    try:
        game_length = html_soup.select('div.game_times li div')[-1].string
    except:
        game_length = "Game length not found"
    try:
        game_developer = html_soup.find_all('strong', string='\nDeveloper:\n')[0].next_sibling
    except:
        game_developer = "Game developer not found"
    try:
        game_publisher = html_soup.find_all('strong', string='\nPublisher:\n')[0].next_sibling
    except:
        game_publisher = "Game Publisher not found"
    try:
        game_console = html_soup.find_all('strong', string='\nPlayable On:\n')[0].next_sibling
    except:
        game_console = "Game Playable on not found"
    try:
        game_genres = html_soup.find_all('strong', string='\nGenres:\n')[0].next_sibling
    except:
        game_genres = "Game Genres not found"
    print(url)
    print(game_name)
    print(game_length)
    print(game_developer)
    print(game_publisher)
    print(game_genres)
    print(game_console)
    # One CSV row per game, in the same column order as the header
    row = [game_name, game_length, game_developer, game_publisher, game_genres, game_console, url]
    thewriter.writerow(row)
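I get this error when I run the code:
ValueError                                Traceback (most recent call last)
 in ()
     46
     47     row = [game_name, game_length, game_developer, game_publisher, game_genres, game_console, url]
---> 48     thewriter.writerow(row)
ValueError: I/O operation on closed file.
I had it working before.
How can I scrape the data and transfer the information into a spreadsheet so that I can manipulate the data?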
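For reference, the error message says that `writerow` was called after the file had already been closed, which happens whenever a `csv.writer` is used once its `with open(...)` block has ended. Below is a minimal sketch of both situations, with no scraping involved. The names `HEADER`, `write_rows`, and `write_after_close` are made up for illustration, and the header is shortened to three columns; this is one possible arrangement, not necessarily how the original script was laid out.

```python
import csv

# Shortened, hypothetical header (the question's script uses seven columns)
HEADER = ['Game Name:', 'Game Length:', 'URL:']


def write_rows(path, rows):
    """Write the header and all rows while the file is still open."""
    # newline='' is what the csv module documentation recommends for csv.writer
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        for row in rows:
            # Still inside the with block, so f is open here
            writer.writerow(row)


def write_after_close(path, row):
    """Use the writer after the with block has ended and return the error text."""
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
    # f was closed when the with block ended
    try:
        writer.writerow(row)
    except ValueError as exc:
        return str(exc)
    return None
```

In the scraping script, the equivalent change would be to keep the `for` loop (and therefore every `thewriter.writerow(row)` call) indented inside the `with open(...)` block. The resulting CSV file can then be opened in a spreadsheet program.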
[Discussion]:
Tags: python-3.x csv web-scraping