Selenium: Next pages with javascript throwing ElementClickInterceptedException Error

【Question title】: Selenium: Next pages with javascript throwing ElementClickInterceptedException Error
【Posted】: 2022-01-16 10:17:58
【Question description】:

My code runs fine, but the pagination portion throws the following exception:

selenium.common.exceptions.ElementClickInterceptedException: 
Message: element click intercepted: 
Element <a href="#cpricehistory" data-toggle="tab" class="pill-item" id="btn_cpricehistory" aria-expanded="true">...</a> 
is not clickable at point (165, 19). 
Other element would receive the click: <a href="#">...</a>

Any help is much appreciated.

Script:

import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import Select
from bs4 import BeautifulSoup
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException


url = 'https://www.sharesansar.com/company/shl'

cdm = ChromeDriverManager().install()
driver = webdriver.Chrome(cdm)

driver.maximize_window()
time.sleep(8)
driver.get(url)
time.sleep(10)
data =[]

while True:
    driver.find_element_by_link_text('Price History').click()
    time.sleep(3)

    select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@name="myTableCPriceHistory_length"]'))))
    select.select_by_visible_text("50")

    soup = BeautifulSoup(driver.page_source,'lxml')

    tables =soup.select('#myTableCPriceHistory tbody tr')

    for table in tables:
        _open = table.select_one('td:nth-child(3)').text
        high = table.select_one('td:nth-child(4)').text
        low = table.select_one('td:nth-child(5)').text
        close = table.select_one('td:nth-child(6)').text

        print ( f"""
        Opening:{_open}
        High:{high}
        Low:{low} 
        """)

    print("-" * 85)

    
    # next_page=driver.find_element_by_xpath('//a[contains(text(),"Next")]')
    # if next_page:
    #     next_page.click()
    #     time.sleep(3)
    # else:
    #     break
#while True:
    try:
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@class="dataTables_paginate paging_simple_numbers"]/span/following-sibling::a'))).click()
        print("Clicked on  Next Page »")
    except TimeoutException:
        print("No more Next Page »")
        break
driver.quit()

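【Question comments】:

Always put the full error message (starting at the word "Traceback") in the question (not in comments) as text (not a screenshot, not a link to an external portal). There is other useful information in it.
The full error message shows that the problem occurs when 'Price History' is clicked again, but you don't need to click it again to get the next page. You should click it only once, before the while loop. The same goes for selecting 50: do it only once, before the while loop.
There was a similar question yesterday, and I showed how to use requests instead of Selenium: scrape responsive table from site whose url doesnt change
TimeoutException is the wrong idea, because the Next button also exists on the last page, so the code will load the last page again and again. You have to check whether it has the class disabled, or use '//a[@class="paginate_button next"]'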

Tags: python selenium

【Solution 1】:

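The full error message shows that the problem occurs when 'Price History' is clicked again, but you don't need to click it again to get the next page. You should click it only once, before the while loop.

The same goes for selecting 50. You should select it only once, before the while loop.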
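
The control flow described above can be sketched without a browser. This is only an illustration of the loop shape: `scrape_all_pages` and the three callables are hypothetical names, and the fake pages stand in for what Selenium would return.

```python
from typing import Callable, List


def scrape_all_pages(
    setup: Callable[[], None],
    read_rows: Callable[[], List[str]],
    go_to_next_page: Callable[[], bool],
) -> List[str]:
    """Run the one-time setup, then read rows page by page."""
    setup()  # e.g. open the 'Price History' tab and select 50 rows, once only
    rows: List[str] = []
    while True:
        rows.extend(read_rows())
        if not go_to_next_page():  # False means the last page was reached
            break
    return rows


# Stand-in for the browser: three fake pages and a counter for setup calls.
pages = [["row 1", "row 2"], ["row 3"], ["row 4"]]
state = {"page": 0, "setup_calls": 0}


def fake_setup() -> None:
    state["setup_calls"] += 1


def fake_read_rows() -> List[str]:
    return pages[state["page"]]


def fake_next() -> bool:
    if state["page"] + 1 >= len(pages):
        return False
    state["page"] += 1
    return True


all_rows = scrape_all_pages(fake_setup, fake_read_rows, fake_next)
print(all_rows)
print(state["setup_calls"])
```

The point of the sketch is that the setup runs exactly once, while only reading rows and moving to the next page happen inside the loop.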

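The other problem is with Next Page: the button exists even on the last page, so the code clicks Next Page again and again and reloads the last page each time.

Normally this button has the class "paginate_button next", but on the last page it has the class "paginate_button next disabled". So if you search for the exact class "paginate_button next", no element matches on the last page, the wait times out, and that is how you detect the last page:

'//a[@class="paginate_button next"]'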
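
To see why the exact class match stops on the last page, here is a small offline check. It uses the standard library's `xml.etree.ElementTree` as a stand-in for the browser, and the pagination markup is a simplified assumption, not the site's real HTML.

```python
import xml.etree.ElementTree as ET

# Same predicate as in the answer: the class attribute must equal this string exactly.
NEXT_XPATH = ".//a[@class='paginate_button next']"

# Simplified, hypothetical pagination markup for a page in the middle.
MIDDLE_PAGE = """
<div class="dataTables_paginate paging_simple_numbers">
  <a class="paginate_button previous">Previous</a>
  <span><a class="paginate_button current">2</a></span>
  <a class="paginate_button next">Next</a>
</div>
"""

# On the last page the Next button gains the extra "disabled" class.
LAST_PAGE = MIDDLE_PAGE.replace(
    "paginate_button next", "paginate_button next disabled"
)


def has_active_next(markup: str) -> bool:
    """Return True if an <a> with exactly the active Next class is present."""
    root = ET.fromstring(markup.strip())
    return root.find(NEXT_XPATH) is not None


print(has_active_next(MIDDLE_PAGE))
print(has_active_next(LAST_PAGE))
```

Because `@class="..."` compares the whole attribute value, the disabled button no longer matches, which is what lets the `TimeoutException` branch end the loop.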

Full working code:

from webdriver_manager.chrome import ChromeDriverManager
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException, TimeoutException
from bs4 import BeautifulSoup
import time

url = 'https://www.sharesansar.com/company/shl'

cdm = ChromeDriverManager().install()
driver = webdriver.Chrome(service=Service(cdm))  # Selenium 4 takes the driver path through Service

driver.maximize_window()
driver.get(url)
time.sleep(10)

data = []

driver.find_element(By.LINK_TEXT, 'Price History').click()  # find_element_by_* was removed in Selenium 4.3
time.sleep(3)

select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@name="myTableCPriceHistory_length"]'))))
select.select_by_visible_text("50")

while True:
    
    soup = BeautifulSoup(driver.page_source, 'lxml')

    tables = soup.select('#myTableCPriceHistory tbody tr')

    for table in tables:
        _open = table.select_one('td:nth-child(3)').text
        high = table.select_one('td:nth-child(4)').text
        low = table.select_one('td:nth-child(5)').text
        close = table.select_one('td:nth-child(6)').text

        print(f"Opening: {_open}\nHigh: {high}\nLow: {low}\n")

    print("-" * 85)
    
    try:
        WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, '//a[@class="paginate_button next"]'))).click()
        print("Clicked on Next Page »")
        time.sleep(5)  # page needs time to load new data
    except TimeoutException:
        print("No more Next Page »")
        break
        
driver.quit()
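
The code above only prints the values. If you would rather collect them (the question creates a `data` list but never fills it), one possible sketch follows. It uses the standard library's `html.parser` so it runs without a browser; `RowCollector`, `rows_to_records`, and the sample row are made up for illustration. The column positions follow the `td:nth-child(3)` to `td:nth-child(6)` selectors used above.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr>."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._row: Optional[List[str]] = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without <td> cells, such as header rows
                self.rows.append([cell.strip() for cell in self._row])
            self._row = None


def rows_to_records(table_html: str) -> List[Dict[str, str]]:
    """Map cells 3 to 6 of each row to open, high, low and close."""
    parser = RowCollector()
    parser.feed(table_html)
    records = []
    for cells in parser.rows:
        if len(cells) < 6:
            continue  # not a full data row
        records.append(
            {"open": cells[2], "high": cells[3], "low": cells[4], "close": cells[5]}
        )
    return records


# Made-up sample row, not real data from the site.
sample = (
    "<table><tbody><tr><td>1</td><td>2022-01-13</td>"
    "<td> 300.0 </td><td>310.0</td><td>295.0</td><td>305.0</td>"
    "</tr></tbody></table>"
)
records = rows_to_records(sample)
print(records)
```

In the real script you could call something like this on `driver.page_source` once per page and extend `data` with the result.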

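By the way:

There was a similar question yesterday, and I showed how to get this table using only requests instead of Selenium. It fetches the JSON data directly from the API, so BeautifulSoup isn't needed.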

scrape responsive table from site whose url doesnt change

【Comments】: