【Question title】: Selenium: Next pages with javascript throwing ElementClickInterceptedException Error
【Posted】: 2022-01-16 10:17:58
【Question description】:

My code runs fine, but the pagination portion throws the following exception:

selenium.common.exceptions.ElementClickInterceptedException: 
Message: element click intercepted: 
Element <a href="#cpricehistory" data-toggle="tab" class="pill-item" id="btn_cpricehistory" aria-expanded="true">...</a> 
is not clickable at point (165, 19). 
Other element would receive the click: <a href="#">...</a>

Any help would be much appreciated.

Script:

import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import Select
from bs4 import BeautifulSoup
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException, TimeoutException


url = 'https://www.sharesansar.com/company/shl'

cdm = ChromeDriverManager().install()
driver = webdriver.Chrome(cdm)

driver.maximize_window()
time.sleep(8)
driver.get(url)
time.sleep(10)
data =[]

while True:
    driver.find_element_by_link_text('Price History').click()
    time.sleep(3)

    select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@name="myTableCPriceHistory_length"]'))))
    select.select_by_visible_text("50")

    soup = BeautifulSoup(driver.page_source,'lxml')

    tables =soup.select('#myTableCPriceHistory tbody tr')

    for table in tables:
        _open = table.select_one('td:nth-child(3)').text
        high = table.select_one('td:nth-child(4)').text
        low = table.select_one('td:nth-child(5)').text
        close = table.select_one('td:nth-child(6)').text

        print ( f"""
        Opening:{_open}
        High:{high}
        Low:{low} 
        """)

    print("-" * 85)

    
    # next_page=driver.find_element_by_xpath('//a[contains(text(),"Next")]')
    # if next_page:
    #     next_page.click()
    #     time.sleep(3)
    # else:
    #     break
#while True:
    try:
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@class="dataTables_paginate paging_simple_numbers"]/span/following-sibling::a'))).click()
        print("Clicked on  Next Page »")
    except TimeoutException:
        print("No more Next Page »")
        break
driver.quit()

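【Comments】:

  • Always put the full error message (starting at the word "Traceback") in the question (not in comments) as text (not a screenshot, not a link to an external portal). It contains other useful information.
  • The full error message shows that the problem occurs when 'Price History' is clicked again, but you don't need to click it again to get the next page. You should click it only once, before the while loop. The same goes for selecting 50: select it only once, before the while loop.
  • There was a similar question yesterday, where I showed how to use requests instead of Selenium: scrape responsive table from site whose url doesnt change
  • Relying on TimeoutException is the wrong idea, because the Next button also exists on the last page, so the script will load the last page again and again. You have to check whether the button has the class disabled, or use '//a[@class="paginate_button next"]'

Tags: python selenium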


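【Solution 1】:

The full error message shows that the problem occurs when 'Price History' is clicked again, but you don't need to click it again to get the next page. You should click it only once, before the while loop.

The same goes for selecting 50. You should select it only once, before the while loop.

The other problem is with Next Page: the button exists even on the last page, so clicking Next Page again and again just loads the last page again and again.

Normally this button has the class "paginate_button next", but on the last page it has the class "paginate_button next disabled". So if you search for the exact class "paginate_button next", the lookup fails on the last page, which lets you detect it: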

'//a[@class="paginate_button next"]'
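
As an alternative to relying on the exact-match XPath timing out, the class attribute can be inspected directly. The following is only a minimal sketch: the helper name next_button_is_disabled is made up for illustration, and the CSS selector in the commented usage assumes the button keeps the "paginate_button next" classes described above.

```python
def next_button_is_disabled(class_attr):
    """Return True when the class attribute contains the 'disabled' token.

    Hypothetical helper, not part of Selenium. It splits the attribute
    into whitespace-separated tokens so that only a real 'disabled' class
    matches, not a longer class name that merely contains the word.
    """
    return "disabled" in (class_attr or "").split()


# Usage sketch inside the scraping loop (needs a live `driver`, so it is
# left as comments here):
#
#   next_btn = driver.find_element(By.CSS_SELECTOR, "a.paginate_button.next")
#   if next_button_is_disabled(next_btn.get_attribute("class")):
#       break          # last page reached
#   next_btn.click()

print(next_button_is_disabled("paginate_button next"))           # False
print(next_button_is_disabled("paginate_button next disabled"))  # True
```

Checking the class this way avoids waiting for a timeout on the last page, at the cost of one extra lookup per page.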

Full working code:

from webdriver_manager.chrome import ChromeDriverManager
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException, TimeoutException
from bs4 import BeautifulSoup
import time

url = 'https://www.sharesansar.com/company/shl'

# Selenium 4 expects the driver path to be wrapped in a Service object
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)

driver.maximize_window()
driver.get(url)
time.sleep(10)

data = []

# find_element_by_link_text was removed in Selenium 4; use find_element with By.LINK_TEXT
driver.find_element(By.LINK_TEXT, 'Price History').click()
time.sleep(3)

select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@name="myTableCPriceHistory_length"]'))))
select.select_by_visible_text("50")

while True:
    
    soup = BeautifulSoup(driver.page_source, 'lxml')

    tables = soup.select('#myTableCPriceHistory tbody tr')

    for table in tables:
        _open = table.select_one('td:nth-child(3)').text
        high = table.select_one('td:nth-child(4)').text
        low = table.select_one('td:nth-child(5)').text
        close = table.select_one('td:nth-child(6)').text

        print(f"Opening: {_open}\nHigh: {high}\nLow: {low}\n")

    print("-" * 85)
    
    try:
        WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, '//a[@class="paginate_button next"]'))).click()
        print("Clicked on Next Page »")
        time.sleep(5)  # page needs time to load new data
    except TimeoutException:
        print("No more Next Page »")
        break
        
driver.quit()

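By the way:

There was a similar question yesterday, where I showed how to get this table using only requests instead of Selenium. It gets the JSON data directly from the API, so BeautifulSoup is not needed: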

scrape responsive table from site whose url doesnt change
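
The linked answer is not reproduced here, so the following is only a rough sketch of how paging over such a JSON API might look. Everything in it is an assumption: the name iter_rows, the fetch_page(start, length) callable, and the response shape with "recordsTotal" and "data" keys (the layout used by DataTables server-side processing, which this site may or may not follow). The real endpoint and its parameters are not given in this thread, so a fake fetch function stands in for the requests call.

```python
def iter_rows(fetch_page, page_size=50):
    """Yield rows from a paged JSON source until it is exhausted.

    fetch_page(start, length) is a caller-supplied, hypothetical function
    that returns a dict shaped like {"recordsTotal": int, "data": [...]}.
    With requests it would wrap something like requests.get(...).json().
    """
    start = 0
    while True:
        payload = fetch_page(start, page_size)
        rows = payload.get("data", [])
        if not rows:
            break  # empty page: nothing left to read
        yield from rows
        start += len(rows)
        # Stop once the reported total is reached; if the total is
        # missing, keep going until an empty page comes back.
        if start >= payload.get("recordsTotal", start + 1):
            break


def fake_fetch(start, length):
    # Stand-in for a real HTTP call; serves 120 made-up rows.
    all_rows = [{"open": i, "high": i + 1, "low": i - 1} for i in range(120)]
    return {"recordsTotal": len(all_rows), "data": all_rows[start:start + length]}


rows = list(iter_rows(fake_fetch, page_size=50))
print(len(rows))  # 120
```

Passing the fetch function in as an argument keeps the paging logic separate from the HTTP details, so it can be tried out without touching the site.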
