【Question title】: Extracting data from a website using beautifulsoup4 and parsing it into CSV
【Posted】: 2020-03-10 20:15:30
【Question description】:
【Discussion】:
Tags:
csv
parsing
beautifulsoup
【Solution 1】:
from bs4 import BeautifulSoup
import requests
import csv

# Download the page and parse it with the built-in HTML parser.
r = requests.get(
    "https://www.berufsstart.de/unternehmen/bundesland/baden-wuerttemberg-top-100.php")
soup = BeautifulSoup(r.text, 'html.parser')

numbers = []
names = []
cities = []

# Number: the second comma-separated text fragment of each non-empty "col-sm-2" div.
for num in soup.find_all("div", class_="col-sm-2"):
    num = num.get_text(strip=True, separator=",")
    if num:
        numbers.append(num.split(',')[1])

# Company name: the text of each <strong class="h2"> element.
for name in soup.find_all("strong", class_="h2"):
    names.append(name.text)

# City: the second space-separated word of each "col-sm-5 infobereich" div.
for city in soup.find_all("div", class_="col-sm-5 infobereich"):
    cities.append(city.get_text(strip=True, separator=" ").split(" ")[1])

# Write one row per company; zip() stops at the shortest of the three lists.
with open("kas.csv", 'w', newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Name", "City", "Number"])
    for a, b, c in zip(names, cities, numbers):
        writer.writerow([a, b, c])

print("Done")
Output: view-online
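
One thing to watch with this approach: the three lists are filled by separate loops, and zip() stops at the shortest one, so if one selector matches fewer elements than the others, rows may be dropped or misaligned without any error. Below is a minimal sketch of a length check before writing. The helper name `rows_to_csv` and the sample values are hypothetical and are not taken from the original answer or from the real page.

```python
import csv
import io


def rows_to_csv(names, cities, numbers):
    """Return CSV text for three parallel lists, refusing mismatched lengths.

    Hypothetical helper for illustration; not part of the original answer.
    """
    if not (len(names) == len(cities) == len(numbers)):
        raise ValueError(
            f"column lengths differ: {len(names)}, {len(cities)}, {len(numbers)}"
        )
    # newline="" leaves the csv module's own line endings untouched.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["Name", "City", "Number"])
    writer.writerows(zip(names, cities, numbers))
    return buffer.getvalue()


# Made-up sample values, not scraped from the real page.
text = rows_to_csv(["Firma A", "Firma B"], ["Stuttgart", "Karlsruhe"], ["1", "2"])
print(text)
```

In the scraping script, the same check could run on `names`, `cities`, and `numbers` just before the file is opened, so that a change in the page layout fails loudly instead of producing a shifted CSV.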