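【Posted】: 2021-04-07 02:53:56
【Question】:
I want to extract all the comments from a website. The site loads its comment section inside an iframe. I tried scraping it with Selenium, but I can only get one comment. How can I scrape the rest of the comments and save them to a CSV or xmls file?
- Code: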
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
# driver.get() returns None, so its result is not assigned
driver.get("https://finance.detik.com/berita-ekonomi-bisnis/d-5307853/ri-disebut-punya-risiko-korupsi-yang-tinggi?_ga=2.13736693.357978333.1608782559-293324864.1608782559")

# The comment section lives inside an iframe, so switch into it first
iframe = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//iframe[@class='xcomponent-component-frame xcomponent-visible']")))
driver.switch_to.frame(iframe)

# Both XPaths below are tied to a single comment id, so only one comment is matched
xpath = '//*[@id="cmt66363941"]/div[1]/div[1]'
extract_name = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, xpath)))
username = extract_name.text

xpath = '//*[@id="cmt66363941"]/div[1]/div[2]'
extract_comment = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, xpath)))
comment = extract_comment.text

print(username, comment)
- Output:
King Akbarmachinery
3 hari yang lalu selama korupsi tidak dihukum mati disanalah korupsi masih liar dan ada kalaupun dibuat hukum mati setidaknya bisa mengurangi angka korupsi itu
Laporkan
2BalasBagikan:
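The code above returns one comment because both XPaths are pinned to the id `cmt66363941`. A possible way to collect every comment is sketched below. It is only a sketch under assumptions: it guesses that every comment container has an id beginning with `cmt` and the same `div[1]/div[1]` (name) and `div[1]/div[2]` (text) layout as the one in the question. The helper names `scrape_comments` and `save_rows` are hypothetical, and comments that only appear after scrolling or clicking a "load more" control are not handled.

```python
import csv


def scrape_comments(url, timeout=20):
    """Return a list of (username, comment) tuples for the comments found."""
    # Imported lazily so save_rows() below can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        wait = WebDriverWait(driver, timeout)
        # Wait for the comment iframe and switch into it in one step.
        wait.until(EC.frame_to_be_available_and_switch_to_it(
            (By.XPATH, "//iframe[@class='xcomponent-component-frame xcomponent-visible']")))
        # ASSUMPTION: every comment container has an id that starts with "cmt".
        containers = wait.until(EC.presence_of_all_elements_located(
            (By.XPATH, "//*[starts-with(@id, 'cmt')]")))
        rows = []
        for box in containers:
            # Relative XPaths (leading ".") search inside this container only.
            # ASSUMPTION: same inner layout as the single comment in the question.
            name = box.find_element(By.XPATH, "./div[1]/div[1]").text
            text = box.find_element(By.XPATH, "./div[1]/div[2]").text
            rows.append((name, text))
        return rows
    finally:
        driver.quit()


def save_rows(rows, path):
    """Write (username, comment) rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["username", "comment"])
        writer.writerows(rows)
```

Calling `save_rows(scrape_comments(url), "comments.csv")` with the article URL would then write one row per comment. If some element whose id starts with `cmt` lacks the expected inner `div`s, `find_element` raises `NoSuchElementException`, so the XPath may need tightening after inspecting the page.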
By the way, how can I remove these lines from the output?
Laporkan
2BalasBagikan:
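One way to drop those lines is to filter the element's text after reading it, rather than changing the XPath. The sketch below assumes the unwanted lines are always the button labels seen in the sample output (`Laporkan`, and `BalasBagikan:` with an optional leading count). The function name `strip_ui_lines` and the pattern are illustrative, not taken from the site's markup.

```python
import re

# ASSUMPTION: these labels come from the sample output in the question;
# the leading digits (e.g. "2") are treated as an optional counter.
UI_LINE = re.compile(r"^(Laporkan|\d*BalasBagikan:?)$")


def strip_ui_lines(text):
    """Remove lines that consist only of a known button label."""
    kept = [line for line in text.splitlines() if not UI_LINE.match(line.strip())]
    return "\n".join(kept)


sample = "3 hari yang lalu selama korupsi tidak dihukum mati\nLaporkan\n2BalasBagikan:"
print(strip_ui_lines(sample))
```

A comment whose entire line is literally `Laporkan` would also be removed by this filter. Selecting a narrower element that holds only the comment body would avoid that, but it requires knowing the page's inner markup.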
【Discussion】:
Tags: python selenium web-scraping iframe