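【Posted】:2021-06-08 09:18:03
【Problem description】:
I am trying to scrape some information from our web page, such as the title, location, and contact number. I am using the Selenium and BS4 libraries in Python. The page only displays the contact number after the "Show number" element is clicked. I tried to click it with Selenium, but it does not work.
My code (what I have tried):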
import time
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
#chrome_options.add_argument("--disable-extensions")
#chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox") # linux only
chrome_options.add_argument("--headless")
# chrome_options.headless = True # also works
#options.add_argument('--disable-gpu') # maybe needed if running on Windows.
driver = webdriver.Chrome(executable_path='/home/dobuyme/Desktop/chromedriver', options=chrome_options)
print("Loading Page...")
driver.get('https://smartarz.com/furniture/1114/furniture/3304/beds-bed-sets/3315/6443')
time.sleep(5)
element = driver.find_elements_by_xpath("/html/body/app-root/div/app-main/div/ng-component/app-sale-property-detail-page/div/main/div[2]/div[1]/div/app-post-detail-image-gallery/app-gallery/div/app-gallery-thumbnails/div/div")
element.click()
soup = BeautifulSoup(driver.page_source,"html.parser")
title = soup.find("span", {"class": ["_truncate_multilines multiline_truncation"]}).get_text()
app = soup.find("div", {"class": ["_top_row-destop text-left"]}).find("h2").get_text().strip()
driver.quit()
contact = soup.find("div", {"class": ["_user_contact"]}).find("p", {"class": ["_call text-left"]}).get_text()
print(contact)
I get this error:
Loading Page...
Traceback (most recent call last):
File "/home/dobuyme/Downloads/BS4 Scrapper/Haraj.com/smart.py", line 18, in <module>
element.click()
AttributeError: 'list' object has no attribute 'click'
I don't know what is wrong with my code. Can anyone fix the problem and help me print the contact number?
Thanks
【Discussion】:
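The traceback points at the cause: find_elements_by_xpath (plural) returns a list of elements, and a list has no click() method. Either pick one element out of the list or use the singular find_element form. Below is a minimal sketch of one way to do that, not a tested fix for this particular site. The helper names click_first and fetch_page_after_click are hypothetical, the sketch assumes chromedriver is available on PATH, and it swaps the fixed time.sleep(5) for an explicit wait. Whether the XPath from the question really targets the "Show number" control (it appears to end at the gallery thumbnails) is something this sketch cannot confirm.

```python
def click_first(driver, by, locator):
    """Click the first element matching the locator and return it.

    find_elements() (plural) returns a list, so one element has to be
    picked out of that list before .click() can be called on it.
    """
    matches = driver.find_elements(by, locator)
    if not matches:
        raise LookupError(f"no element matches {locator!r}")
    matches[0].click()
    return matches[0]


def fetch_page_after_click(url, xpath, timeout=10):
    """Open url in headless Chrome, click the element at xpath,
    and return the resulting page source."""
    # Imported inside the function so click_first() can be used
    # (and tried out) without Selenium or a browser installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--no-sandbox")  # Linux only
    options.add_argument("--headless")
    # Assumption: chromedriver is discoverable on PATH.
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the target is clickable instead of sleeping
        # for a fixed number of seconds.
        WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable((By.XPATH, xpath))
        )
        click_first(driver, By.XPATH, xpath)
        return driver.page_source
    finally:
        # Always close the browser, even if the click fails.
        driver.quit()
```

The returned page source can then be handed to BeautifulSoup(page_source, "html.parser") as in the question. If the number is rendered asynchronously after the click, a second explicit wait on the contact element may be needed before reading page_source.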
Tags: python selenium beautifulsoup