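【Posted】: 2019-09-02 01:55:27
【Question】:
I'm trying to learn web scraping with BeautifulSoup and have written the code below. However, only the movie titles are written to the CSV file, not the genres, even though both are retrieved.
URL: http://www.imdb.com/search/title?sort=num_votes,desc&start=1&title_type=feature&year=1950,2012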
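import csv
import requests
from bs4 import BeautifulSoup

# newline='' avoids blank rows on Windows; the with-block flushes and closes the file
with open('movie-names.csv', 'w', newline='', encoding='utf-8') as csv_file:
    f = csv.writer(csv_file)
    f.writerow(['Title', 'Genre'])
    pages = []
    # genre;  (stray statement from the original post; it raises NameError, so it is commented out)
    for i in range(1, 2):
        url = 'http://www.imdb.com/search/title?sort=num_votes,desc&start=1&title_type=feature&year=1950,2012'
        pages.append(url)
    for item in pages:
        page = requests.get(item)
        soup = BeautifulSoup(page.text, 'html.parser')
        # each result block holds both the title link and the genre element
        movie_titles = soup.find_all(class_='lister-item-content')
        for movie_title in movie_titles:
            title = movie_title.find('a').contents[0]
            # get_text() may include surrounding newlines and padding, so strip it
            genre = movie_title.find_all(class_='genre')[0].get_text().strip()
            print(genre)
            f.writerow([title, genre])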
【Comments】:
- Is the genre; at the top of your code a typo?
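One possible explanation, offered as a sketch rather than a confirmed diagnosis: if the text returned by get_text() starts with a newline, csv.writer quotes the field and the genre ends up on the next physical line of the file, which can look as if only the title was written. The title and genre strings below are made-up sample values, and the leading newline is an assumption about the page markup. The helper rows_to_csv is a hypothetical name used only for this illustration.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows to CSV text in memory, using csv.writer as the question does."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample values: the leading newline and trailing spaces are assumed
title = 'The Shawshank Redemption'
raw_genre = '\nDrama            '

# Unstripped: the embedded newline forces quoting and splits the record visually
raw_csv = rows_to_csv([[title, raw_genre]])

# Stripped: title and genre stay on a single physical line
clean_csv = rows_to_csv([[title, raw_genre.strip()]])

print(repr(raw_csv))
print(repr(clean_csv))
```

With the unstripped value, the record spans two physical lines in the file, so a text editor or a quick glance shows the title alone on the first line. Calling .strip() on the genre before writerow keeps each record on one line.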
Tags: python python-3.x csv web-scraping beautifulsoup