How do I iterate through a webelements list with Python and Selenium? Answers

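【Question title】: How do I iterate through a webelements list with Python and Selenium?
【Posted】: 2017-08-31 22:17:23
【Question】:

I want to iterate through a list of webelements and return the text from each of them, but I only get the text from the first <h2> element and not from the remaining elements inside the other <li> tags, and then the code exits the loop.

Here is part of the HTML I want to extract the text from: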

<div class="KambiBC-event-page-component__column KambiBC-event-page-component__column--1">
  
            <ul class="KambiBC-list-view__column">
              <li class="KambiBC-bet-offer-category KambiBC-collapsible-container KambiBC-expanded KambiBC-bet-offer-category--hidden KambiBC-bet-offer-category--fade-in">
                <header class="KambiBC-bet-offer-category__header" data-touch-feedback="true">
                  <h2 class="KambiBC-bet-offer-category__title js-bet-offer-category-title">Piete selectate</h2>
                </header>
              </li>
              <li class="KambiBC-bet-offer-category KambiBC-collapsible-container KambiBC-expanded KambiBC-bet-offer-category--hidden KambiBC-bet-offer-category--fade-in">
                 <header class="KambiBC-bet-offer-category__header" data-touch-feedback="true">
                  <h2 class="KambiBC-bet-offer-category__title js-bet-offer-category-title">Another text</h2>
                 </header>
              </li>

              <li class="KambiBC-bet-offer-category KambiBC-collapsible-container KambiBC-bet-offer-category--hidden KambiBC-bet-offer-category--fade-in">
                 <header class="KambiBC-bet-offer-category__header" data-touch-feedback="true">
                  <h2 class="KambiBC-bet-offer-category__title js-bet-offer-category-title">Different text</h2>
                 </header>
             </li>
                
              <li class="KambiBC-bet-offer-category KambiBC-collapsible-container KambiBC-bet-offer-category--hidden KambiBC-bet-offer-category--fade-in">
                 <header class="KambiBC-bet-offer-category__header" data-touch-feedback="true">
                  <h2 class="KambiBC-bet-offer-category__title js-bet-offer-category-title">Yet another text</h2>
                 </header>
              </li>
                
            </ul>
                  
      
      </div>

Here is the Python code:

import time
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Edge(r"D:\pariuri\python\MicrosoftWebDriver.exe")
driver.implicitly_wait(5)

driver.get("https://www.unibet.ro/betting#filter/football")

try:
    element_present = EC.presence_of_element_located((By.CLASS_NAME, 'KambiBC-event-result__score-list'))
    WebDriverWait(driver, 4).until(element_present)
except TimeoutException:
    print ('Timed out waiting for page to load') 

event = driver.find_elements_by_class_name('KambiBC-event-item KambiBC-event-item--type-match') 

for items in event:
   link = items.find_element_by_class_name('KambiBC-event-item__link')
   scoruri =  items.find_element_by_class_name('KambiBC-event-item__score-container') 
   
   scor1 =  scoruri.find_element_by_xpath(".//li[@class='KambiBC-event-result__match']/span[1]")
   scor2 =  scoruri.find_element_by_xpath(".//li[@class='KambiBC-event-result__match']/span[2]")
   
   print (scor1.text)
   print (scor2.text)
   if scor1.text == '0' and scor2.text == '0':
       

        link.click()
        time.sleep(3)


        PlajePariuri = driver.find_elements_by_xpath("//ul[@class='KambiBC-list-view__column']")
        for items in PlajePariuri:
             NumePlaje = items.find_element_by_xpath("//li/header/h2")
             print (NumePlaje.text)

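【Question comments】:

From the HTML provided, I don't see any such text in the li tags. Am I missing something?
@DebanjanB Look at the <h2> tags, the ones with that class name.
The HTML you provided contains only one <h2> tag that you can retrieve. Unless you provide more <h2> tags, it is hard to come up with a dynamic xpath.
@DebanjanB Every li tag contains the same header and h2 elements, all with the same class name as the first one but with different text.
@Rius2 The HTML is not valid. Many closing tags are missing, and there seem to be many nested lists, or they are not closed properly either. Besides that, your Python code references attributes that do not appear in the HTML snippet at all. Please provide a valid HTML snippet that corresponds to the Python code you need help with.

Tags: python selenium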

【Solution 1】:

It was in front of me all along. This prints the text of each element; glad I found it.

PlajePariuri = driver.find_elements_by_class_name('KambiBC-bet-offer-category KambiBC-collapsible-container KambiBC-expanded KambiBC-bet-offer-category--hidden KambiBC-bet-offer-category--fade-in')

for items2 in PlajePariuri:
    NumePlaje = items2.find_element_by_class_name('KambiBC-bet-offer-category__title js-bet-offer-category-title')
    print (NumePlaje.text)
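
A class-name locator is meant for a single class, so passing the whole space-separated class attribute may not work on every driver. Below is a minimal sketch of one alternative: turn the attribute into a CSS selector that requires all of the listed classes. `css_for_classes` and `category_titles` are hypothetical helper names, `driver` is assumed to be an already opened Selenium WebDriver, and the choice of which two `<li>` classes to match is an assumption.

```python
def css_for_classes(class_attr):
    """Build a CSS selector that matches elements carrying every listed class."""
    return "".join("." + name for name in class_attr.split())


def category_titles(driver):
    """Return the text of each category title (driver: an open Selenium WebDriver)."""
    # Two classes shared by every <li> in the question's HTML (assumed sufficient).
    item_css = css_for_classes("KambiBC-bet-offer-category KambiBC-collapsible-container")
    title_css = css_for_classes("KambiBC-bet-offer-category__title js-bet-offer-category-title")
    titles = []
    # "css selector" is the locator string that By.CSS_SELECTOR stands for.
    for item in driver.find_elements("css selector", item_css):
        titles.append(item.find_element("css selector", title_css).text)
    return titles
```

Calling `category_titles(driver)` after the event page has loaded should return one string per `<li>`, provided the class names still match the live page.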

【Comments】:

【Solution 2】:

Try the code below:

PlajePariuri = driver.find_elements_by_xpath("//ul[@class='KambiBC-list-view__column']//li/header/h2")
for items in PlajePariuri:
    print (items.text)

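【Comments】:

The HTML contains more text, and this prints all the text it finds. I need the text between <h2> tags that have a specific class name.
@Rius2 I don't think so. Have you tried it?
I'm sure. As I said, the actual page has more elements with more text inside the ul list; I only posted the part I'm currently interested in. So your code prints everything. I tried it.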
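
If the list also contains other text, the XPath can be narrowed to `<h2>` elements that carry one particular class. Below is a minimal sketch, assuming the title class from the question's HTML is the one wanted; `xpath_with_class` and `title_texts` are hypothetical helper names and `driver` is assumed to be an already opened WebDriver. It relies on the common XPath 1.0 idiom for matching a single class token inside a multi-class attribute.

```python
def xpath_with_class(tag, class_name):
    """XPath step matching <tag> elements whose class list contains class_name."""
    return (
        f"{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


def title_texts(driver):
    """Return the text of every category title inside the list column."""
    xpath = (
        "//ul[@class='KambiBC-list-view__column']//"
        + xpath_with_class("h2", "KambiBC-bet-offer-category__title")
    )
    # "xpath" is the locator string that By.XPATH stands for.
    return [h2.text for h2 in driver.find_elements("xpath", xpath)]
```

Padding the attribute with spaces before calling `contains` avoids matching a longer class name that merely starts with the same characters.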

【Solution 3】:

Instead of the classname locator, try using xpath, like this:

PlajePariuri = driver.find_elements_by_xpath("//ul[@class='KambiBC-list-view__column']")
for items in PlajePariuri:
    NumePlaje = items.find_element_by_xpath("//li/header/h2")
    print (NumePlaje.text)

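【Comments】:

Interesting, now I get the text from the first element three times, and then it exits the loop.
@Rius2 I get the same effect, repeated elements. Did you find a solution?
@YodaScholtz I found the solution for the repeated elements. Nothing was actually duplicated; the loop just wasn't iterating over the WebElements correctly. Found the answer: link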
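
The repeated first title follows from how XPath is evaluated: an expression that starts with `//` is searched from the document root even when it is called on an element, so `find_element` keeps returning the first `<h2>` on the page. Below is a minimal sketch of the usual fix, which adds a leading dot to make the path relative and uses `find_elements` to collect every match. `titles_per_column` is a hypothetical name and `driver` is assumed to be an already opened WebDriver.

```python
def titles_per_column(driver):
    """Collect the <h2> text of every <li> in each list column."""
    titles = []
    columns = driver.find_elements("xpath", "//ul[@class='KambiBC-list-view__column']")
    for column in columns:
        # The leading "." scopes the search to this column instead of the whole page.
        for h2 in column.find_elements("xpath", ".//li/header/h2"):
            titles.append(h2.text)
    return titles
```

For the HTML in the question this should yield the four titles in document order, assuming the live page has the same structure.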

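【Solution 4】:

I wrote an implementation to find elements inside a list.

In my case we have a wiki with a side list, and that list may or may not contain further lists, and so on. Here is my solution: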

from re import search

# Create a function that receives the old HTML (before the click),
# the new HTML (after the click), and the element I'm looking for:

def page_handler(old_source, new_source, element):
    new_content = []

    # Split each page into a list of lines (verify that this works for your page)
    old_page = old_source.split('\n')
    new_page = new_source.split('\n')

    # Compare the old page with the new page and keep the new lines
    # that contain the element I'm looking for

    for data in new_page:
        if data not in old_page:
            if element in data:
                new_content.append(data)

    return new_content

# Now in the main thread, before the program clicks on the item, take a snapshot:
old_page = driver.page_source

# Click on the item
driver.find_element_by_link_text(item).click()

# Take a new snapshot
new_page = driver.page_source

# Use the function to send the old page and new page, and the class you are looking
# for in the HTML code:
new_pg_data = page_handler(old_page, new_page, 'class="plugin_pagetree_children_span"')

# Now I have the children elements, just iterate over the list.

for line in new_pg_data:
    # I use a regexp to get the element ID
    match = search('id="(.*)">  ', line)
    if match:
        # group(1) is the captured ID; find_element_by_id needs a string
        driver.find_element_by_id(match.group(1)).click()

I hope this solution helps you.
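
To see what the snapshot comparison does without a browser, here is a minimal sketch that runs on two hand-written HTML strings. `new_lines_containing` mirrors the answer's `page_handler`; the sample markup, the `child-1`/`child-2` ids, and the `id="([^"]+)"` pattern are assumptions made for illustration, not the wiki's real HTML.

```python
import re


def new_lines_containing(old_source, new_source, marker):
    """Return lines that appear only in new_source and contain marker."""
    old_lines = set(old_source.split("\n"))
    return [
        line for line in new_source.split("\n")
        if line not in old_lines and marker in line
    ]


# Hypothetical snapshots taken before and after a click.
before = '<ul>\n<li id="root">Home</li>\n</ul>'
after = (
    '<ul>\n<li id="root">Home</li>\n'
    '<span class="plugin_pagetree_children_span" id="child-1">A</span>\n'
    '<span class="plugin_pagetree_children_span" id="child-2">B</span>\n</ul>'
)

added = new_lines_containing(before, after, 'class="plugin_pagetree_children_span"')
# Pull the id attribute out of each newly added line.
matches = (re.search(r'id="([^"]+)"', line) for line in added)
child_ids = [m.group(1) for m in matches if m]
print(child_ids)
```

Because the comparison works line by line, it depends on how the page source happens to be split into lines, so it is most reliable when each element sits on its own line.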

【Comments】: