【Posted】:2020-07-23 04:32:14
【Question】:
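I want to extract the company name, contact person, country, phone number, and email address into an Excel file. I tried the following code, but it writes only one value to the file. How can I loop over the first page and the following pages as well?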
import csv
import re
import requests
import urllib.request
from bs4 import BeautifulSoup

for page in range(10):
    url = "http://www.aepcindia.com/buyersdirectory"
    soup = BeautifulSoup(urllib.request.urlopen(url).read(), 'lxml')
    tbody = soup('div', {'class':'view-content'})#[0].find_all('')
    f = open('filename.csv', 'w', newline = '')
    Headers = "Name,Person,Country,Email,Phone\n"
    csv_writer = csv.writer(f)
    f.write(Headers)
    for i in tbody:
        try:
            name = i.find("div", {"class":"company_name"}).get_text()
            person = i.find("div", {"class":"title"}).get_text()
            country = i.find("div", {"class":"views-field views-field-field-country"}).get_text()
            email = i.find("div", {"class":"email"}).get_text()
            phone = i.find("div", {"class":"telephone_no"}).get_text()
            print(name, person, country, email, phone)
            f.write("{}".format(name).replace(","," ")+ ",{}".format(person)+ ",{}".format(country)+ ",{}".format(email) + ",{}".format(phone) + "\n")
        except: AttributeError
    f.close()
【Discussion】:
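A few things in the posted code would each explain the single row. `soup('div', {'class':'view-content'})` probably matches only the one outer container, so the inner loop runs once and `find` returns just the first company inside it. The URL never uses `page`, so every request fetches the same page. And if the file is opened inside the page loop, mode `'w'` truncates it on every pass. Below is a minimal sketch of one way to restructure it, not a verified fix for this site. It rests on two assumptions I could not confirm from the question: that each company sits in its own `div` with class `views-row`, and that the site paginates with a zero-based `?page=N` query parameter. The helper names `parse_rows` and `scrape` are made up for illustration.

```python
import csv
import urllib.request

from bs4 import BeautifulSoup

BASE_URL = "http://www.aepcindia.com/buyersdirectory"

# Column header -> CSS class, taken from the code in the question.
FIELDS = {
    "Name": "company_name",
    "Person": "title",
    "Country": "views-field-field-country",
    "Email": "email",
    "Phone": "telephone_no",
}


def parse_rows(html):
    """Return one list of field values per company entry found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # ASSUMPTION: each company is wrapped in <div class="views-row">.
    for entry in soup.find_all("div", class_="views-row"):
        values = []
        for css_class in FIELDS.values():
            tag = entry.find("div", class_=css_class)
            # A missing field becomes an empty cell instead of dropping the row.
            values.append(tag.get_text(strip=True) if tag else "")
        rows.append(values)
    return rows


def scrape(pages, out_path="filename.csv"):
    """Fetch `pages` result pages and write all rows to a single CSV file."""
    # Open the file once, outside the page loop, so earlier rows survive.
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)  # handles quoting of commas inside values
        writer.writerow(list(FIELDS))
        for page in range(pages):
            # ASSUMPTION: pagination uses a zero-based ?page=N parameter.
            url = "{}?page={}".format(BASE_URL, page)
            html = urllib.request.urlopen(url).read()
            writer.writerows(parse_rows(html))
```

Calling `scrape(10)` would request ten pages and write them to `filename.csv`. Since `csv.writer` quotes values that contain commas, the manual `.replace(",", " ")` on the company name is no longer needed. If the real markup uses a different wrapper class, only the `find_all` line in `parse_rows` should need to change.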
Tags: python beautifulsoup screen-scraping