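【Posted】: 2021-02-19 03:14:51
【Question】:
I am trying to use Selenium to scrape product prices from Amazon across multiple pages. I can collect all the elements for the product names and product prices, but Selenium raises an error when I try to extract the text from them.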
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
Here is my code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException, StaleElementReferenceException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from openpyxl import Workbook
import time

driver = webdriver.Chrome(r'C:\Users\varun\OneDrive\Documents\python projects\chromedriver.exe')
url = 'https://www.amazon.in/'
driver.get(url)
driver.find_element(By.XPATH, "//input[@id='twotabsearchtextbox']").send_keys("oppo mobile")
driver.find_element(By.XPATH, "//input[@value='Go']").click()
brand = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//span[text() = 'Oppo']")))
brand.click()
ele = driver.find_element(By.XPATH, "//ul[@class='a-pagination']/li[6]")
url_list = []
products_list = []
prices_list = []
for page in range(int(ele.text)):
    page_ = page+1
    url_list.append(driver.current_url)
    prod_name_list = driver.find_elements(By.XPATH, "//span[@class='a-size-medium a-color-base a-text-normal']")
    prod_prices_list = driver.find_elements(By.XPATH, "//span[@class='a-price-whole']")
    driver.implicitly_wait(4)
    products_list = products_list + prod_name_list
    prices_list = prices_list + prod_prices_list
    try:
        driver.find_element(By.XPATH, "//li[@class='a-last']").click()
        print("page " + str(page_) + " is grabbed.")
        print(driver.current_url)
    except NoSuchElementException:
        print("All pages are collected!")
    time.sleep(5)
print("---------------------------------------------------")
print(products_list)
print("---------------------------------------------------")
print(prices_list)
product_name = []
prices = []
for product in products_list:
    product_name.append(product.text)
for price in prices_list:
    prices.append(price.text)
print(product_name)
print(prices)
The error is raised at these lines:
for product in products_list:
    product_name.append(product.text)
for price in prices_list:
    prices.append(price.text)
I tried slowing the scraping down by adding an implicit wait, but the error still appears. Please help me resolve it. Thanks!
【Comments】:
-
You should append the text in the earlier for loop. If you indent your two for loops so that they are part of the previous loop, it will work.
-
Yes, it worked!!! Thank you! @ArundeepChohan
-
driver.implicitly_wait(4) is something you only set once.
-
Sorry? I didn't follow you. @ArundeepChohan
-
It is something you don't need to set inside the loop; you can move it out.
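For illustration, the suggestions in the comments could be sketched roughly as follows. This is a minimal sketch, not the asker's final code: `scrape_pages`, its `pause` parameter, and the `XPATH` constant are hypothetical names introduced here, and the XPath strings are copied from the question. It reads `.text` inside the page loop, so no `WebElement` is kept past the click that loads the next page, and it uses `find_elements` (which returns an empty list when nothing matches) to detect the last page.

```python
import time

# Same string value as selenium.webdriver.common.by.By.XPATH; spelled out here
# only so the sketch has no import-time dependency (an assumption of this sketch).
XPATH = "xpath"

# XPath expressions taken from the question.
NAME_XPATH = "//span[@class='a-size-medium a-color-base a-text-normal']"
PRICE_XPATH = "//span[@class='a-price-whole']"
NEXT_XPATH = "//li[@class='a-last']"


def scrape_pages(driver, max_pages, pause=5.0):
    """Collect product names and prices as plain strings, page by page.

    Hypothetical helper: `driver` is expected to be a Selenium WebDriver
    that is already showing the first results page.
    """
    names, prices = [], []
    for _ in range(max_pages):
        # Read .text now, while these elements still belong to the loaded page.
        names.extend(el.text for el in driver.find_elements(XPATH, NAME_XPATH))
        prices.extend(el.text for el in driver.find_elements(XPATH, PRICE_XPATH))

        # find_elements returns an empty list instead of raising when
        # the "next" button is absent, i.e. on the last page.
        next_buttons = driver.find_elements(XPATH, NEXT_XPATH)
        if not next_buttons:
            break
        next_buttons[0].click()  # loading the next page makes old elements stale
        time.sleep(pause)  # crude fixed wait, as in the question
    return names, prices
```

With a real driver this might be called after the search and brand-filter steps, for example `names, prices = scrape_pages(driver, int(ele.text))`, with `driver.implicitly_wait(4)` set once right after the driver is created rather than inside the loop.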
Tags: python selenium selenium-webdriver web-scraping xpath