【Posted】: 2020-12-16 11:40:04
【Question】:
I am trying to scrape some information from this site: https://fr.trustpilot.com/review/jardiland.com
Here is my script so far:
import requests
from requests import get
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np

urls = ["https://fr.trustpilot.com/review/jardiland.com",
        "https://fr.trustpilot.com/review/jardiland.com?page=2",
        "https://fr.trustpilot.com/review/jardiland.com?page=3",
        "https://fr.trustpilot.com/review/jardiland.com?page=4",
        "https://fr.trustpilot.com/review/jardiland.com?page=5",
        "https://fr.trustpilot.com/review/jardiland.com?page=6",
        "https://fr.trustpilot.com/review/jardiland.com?page=7",
        "https://fr.trustpilot.com/review/jardiland.com?page=8"]

comms = []
notes = []

for url in urls:
    results = requests.get(url)
    soup = BeautifulSoup(results.text, "html.parser")

    commentary = soup.find_all('p', class_='review-content__text')
    for container in commentary:
        comm = container.text
        comms.append(comm)

    ratings = soup.find_all('div', class_='star-rating star-rating--medium')
    for container2 in ratings:
        rating = container2.text
        notes.append(rating)

data = pd.DataFrame({
    'comms' : comms,
    'notes' : notes})
data['comms'] = data['comms'].str.replace('\n', '')
#print(data.head())
data.to_csv('file.csv', sep=';', index=False)
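Here is my result: output
I get the comments but not the ratings, and I am not sure how to retrieve them.
Here is the page source: codesource
I want to get "1 étoile : mauvais", but the structure is tricky.
Any ideas on how to do this?
Thanks.
【Discussion】:
Tags: python html python-3.x web-scraping beautifulsoup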
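One possible explanation, offered as a guess since the page source is not reproduced here: the star-rating div may contain no text at all and only hold an img whose alt attribute carries the label ("1 étoile : mauvais"). In that case .text returns an empty string, and the label has to be read from the attribute instead. The sketch below illustrates this on a made-up HTML fragment; SAMPLE_HTML and extract_ratings are hypothetical names, and the markup is an assumption rather than a copy of the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample mimicking the ASSUMED markup: the star-rating <div>
# has no text of its own; the label lives in the <img> alt attribute.
SAMPLE_HTML = """
<div class="star-rating star-rating--medium">
  <img src="stars-1.svg" alt="1 étoile : mauvais">
</div>
<p class="review-content__text">Livraison très lente.</p>
<div class="star-rating star-rating--medium">
  <img src="stars-5.svg" alt="5 étoiles : excellent">
</div>
<p class="review-content__text">Très bon service.</p>
"""


def extract_ratings(html):
    """Return the alt text of the <img> inside each star-rating <div>."""
    soup = BeautifulSoup(html, "html.parser")
    ratings = []
    # class_="star-rating" matches any <div> that has this class among its classes
    for div in soup.find_all("div", class_="star-rating"):
        img = div.find("img")
        # Use None when the <div> has no <img>, so the list keeps one entry per <div>
        ratings.append(img.get("alt") if img is not None else None)
    return ratings


print(extract_ratings(SAMPLE_HTML))
# ['1 étoile : mauvais', '5 étoiles : excellent']
```

If the assumption holds, the same lookup (container2.find("img") followed by .get("alt")) could replace container2.text in the loop from the question. Note that pd.DataFrame requires comms and notes to have the same length, so it may be worth checking that each page yields as many ratings as comments.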