【Question title】:How to scrape every element that shares the same div using Python Selenium
【Posted】:2020-04-22 11:46:57
【Question】:

I am trying to scrape all the matches listed on this page:

https://web.bet9ja.com/Sport/OddsToday.aspx?IDSport=590

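What I need is:

1. Click a match name on the page linked above (for example kuttosh kujand), scrape the data, navigate back, then click the next match name. This should be repeated for every match on the page.

So far my code does this for the first match, but how can I repeat it for all the matches?

The code I wrote: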

# Here using selenium for scraping
# importing necessary modules
import selenium.webdriver
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import mysql.connector
import pymysql
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# the relevant url
url = 'https://web.bet9ja.com/Sport/OddsToday.aspx?IDSport=590'

# the driver path
driver = webdriver.Chrome(r"c:/Users/SATYA/mysite/chromedriver")
driver.get(url)
driver.implicitly_wait(10) # seconds
buttons = WebDriverWait(driver,15).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.Event.ng-binding")))
for btn in buttons:
    btn.click()
    headings= [item.text for item in driver.find_elements_by_css_selector("div.SECQ.ng-binding")]
    keys = [item.text for item in driver.find_elements_by_css_selector("div.SEOdd.g1")]
    values = [item.text for item in driver.find_elements_by_css_selector("div.SEOddLnk.ng-binding")]
    driver.execute_script("window.history.go(-1)")
    print(headings,keys,values)

Can anyone help me solve this?

After scraping the data for the first match, my code fails with this error:

Traceback (most recent call last):
  File "dynamicscrape.py", line 21, in <module>
    btn.click()
  File "C:\Users\SATYA\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\webelement.py", line 80, in click
    self._execute(Command.CLICK_ELEMENT)
  File "C:\Users\SATYA\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\SATYA\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\SATYA\AppData\Local\Programs\Python\Python37\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
  (Session info: chrome=81.0.4044.113)

【Comments】:

  • StaleElementReferenceException means the result of your earlier findByElement command is stale, so you have to run it again...
  • @AmitJain Yes, I re-ran it and it still does not work
  • @AmitJain What do you mean by rerun?
  • try {WebElement element=findElement("xpath1"); // here page refreshed element.click(); } catch(StaleElementReferenceException){ // exception occurs - reexecute findElement("xpath1"); }
  • @AmitJain I don't understand where in my code this should go; could you explain, if you don't mind?
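
The try/catch idea in the comment above is written in Java style. A minimal Python sketch of the same retry pattern follows; it is an illustration, not code from the thread. The helper name `retry` and its parameters are hypothetical, and the exception type is passed in so the sketch runs without a browser.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    delay: float = 0.0,
) -> T:
    """Run action(); if it raises one of retry_on, run it again.

    The action must look the element up again on every call,
    otherwise it would keep using the same stale reference.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay)
    raise AssertionError("unreachable")
```

With Selenium, `retry_on` would be `(StaleElementReferenceException,)` from `selenium.common.exceptions`, and `action` would be a function that first finds the element again and then clicks it.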

Tags: python-3.x selenium selenium-chromedriver webdriverwait


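【Solution 1】:

The "stale element reference: element is not attached to the page document" error occurs when an element reference you captured earlier is no longer attached to the page. Here that happens because navigating into a match and back reloads the list, so the references collected before the first click no longer point at live elements.

To get around this, locate the elements again on every iteration so that you never use a stale reference.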

buttons = WebDriverWait(driver, 15).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.Event.ng-binding")))
for btn in range(len(buttons)):
    # Locate the elements again on every pass; the previous references went stale when the page reloaded.
    buttons = WebDriverWait(driver, 15).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.Event.ng-binding")))
    buttons[btn].click()
    # find_elements(By.CSS_SELECTOR, ...) replaces find_elements_by_css_selector, which newer Selenium releases removed.
    headings = [item.text for item in driver.find_elements(By.CSS_SELECTOR, "div.SECQ.ng-binding")]
    keys = [item.text for item in driver.find_elements(By.CSS_SELECTOR, "div.SEOdd.g1")]
    values = [item.text for item in driver.find_elements(By.CSS_SELECTOR, "div.SEOddLnk.ng-binding")]
    driver.execute_script("window.history.go(-1)")
    print(headings, keys, values)
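
The same idea can be factored into a small helper. The following is a minimal sketch and not part of the original answer: `scrape_each_by_index`, `locate` and `scrape` are hypothetical names, and the sketch assumes the list might get shorter between reloads, so it checks the index against the freshly located list before using it.

```python
from typing import Callable, List, Sequence, TypeVar

E = TypeVar("E")
R = TypeVar("R")


def scrape_each_by_index(
    locate: Callable[[], Sequence[E]],
    scrape: Callable[[E], R],
) -> List[R]:
    """Call scrape() on every located element, locating again before each call."""
    results: List[R] = []
    index = 0
    while True:
        elements = locate()         # fresh references after every navigation
        if index >= len(elements):  # done, or the list shrank in the meantime
            break
        results.append(scrape(elements[index]))
        index += 1
    return results
```

With Selenium, `locate` could return the result of the `WebDriverWait(...).until(...)` call shown above, and `scrape` could click the element, read the headings, keys and values, and then go back one page.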

To scrape only certain matches, add an if clause:

buttons = WebDriverWait(driver, 15).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.Event.ng-binding")))
for btn in range(len(buttons)):
    # Locate the elements again on every pass to avoid stale references.
    buttons = WebDriverWait(driver, 15).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.Event.ng-binding")))
    # Only the matches at these indices are clicked and scraped.
    if (btn==1) or (btn==3) or (btn==4):
        buttons[btn].click()
        headings = [item.text for item in driver.find_elements(By.CSS_SELECTOR, "div.SECQ.ng-binding")]
        keys = [item.text for item in driver.find_elements(By.CSS_SELECTOR, "div.SEOdd.g1")]
        values = [item.text for item in driver.find_elements(By.CSS_SELECTOR, "div.SEOddLnk.ng-binding")]
        driver.execute_script("window.history.go(-1)")
        print(headings, keys, values)
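
An alternative to testing every index is to loop only over the indices you want. The following is a hedged sketch rather than the answerer's code: `scrape_selected`, `locate`, `scrape` and `wanted` are hypothetical names, and indices that fall outside the current list are skipped.

```python
from typing import Callable, Dict, Iterable, Sequence, TypeVar

E = TypeVar("E")
R = TypeVar("R")


def scrape_selected(
    locate: Callable[[], Sequence[E]],
    scrape: Callable[[E], R],
    wanted: Iterable[int],
) -> Dict[int, R]:
    """Scrape only the elements at the wanted indices, locating again each time."""
    results: Dict[int, R] = {}
    for index in sorted(set(wanted)):  # drop duplicates, visit in page order
        elements = locate()            # fresh references for this pass
        if not 0 <= index < len(elements):
            continue                   # index not on the page: skip it
        results[index] = scrape(elements[index])
    return results
```

Calling it with `wanted=[1, 3, 4]` would cover the same matches as the if clause above, without waiting on the element list for indices that are never clicked.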

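【Comments】:

  • @Kunduk Why did you write buttons[btn].click()? Can you explain?
  • @sweety : As you can see, the for loop iterates over range(), so btn is an index here. buttons[btn] is therefore the element at index 0, 1, 2, ... up to the element count.
  • ok @Kundunk, if I only want to scrape matches 1, 3 and 4, what should I do? Could you tell me?
  • @Kundunk I have accepted your answer; could you explain, if you don't mind?
  • @sweety : You need to add an if clause like this: if (btn==1) or (btn==3) or (btn==5) :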