How to extract the text content from multiple span elements using Selenium and Python: answers

【Question title】: How to extract the text content from multiple span elements using Selenium and Python
【Posted】: 2020-11-03 07:01:54
【Question description】:

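How can I use Selenium to get the text content from multiple DIV elements?

I want to collect information from a website where the divs and spans share the same class names.

How can I collect each piece of information separately?

I need the content of every span inside the panel-body div of each block: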

driver.find_element_by_xpath(".//div[@class='panel-body'][1]/span[1]").text
driver.find_element_by_xpath(".//div[@class='panel-body'][1]/span[2]").text
driver.find_element_by_xpath(".//div[@class='panel-body'][1]/span[3]").text

driver.find_element_by_xpath(".//div[@class='panel-body'][2]/span[1]").text
driver.find_element_by_xpath(".//div[@class='panel-body'][2]/span[2]").text

HTML:

        <div class="panel-heading">
            <h3 class="panel-title">Identificação</h3>
        </div>


        <div class="panel-body">
            <span class="spanValorVerde">TEXT</span><br>
            <span style="font-size:small;color:gray">TEXT</span><br>
            <br>
            <span class="spanValorVerde">TEXT</span>
        </div>


    </div>

    <div class="panel panel-success">

    
        <div class="panel-heading">
            <h3 class="panel-title">Situação Atual</h3>
        </div>


        <div class="panel-body">
            <span class="spanValorVerde">TEXT</span> <br>
            <span class="spanValorVerde">TEXT</span>
        </div>


    </div>

【Comments】:

I hope "select the text" means "get the text". If you search with findElements using ".//div[@class='panel-body'][i]", you get all the matching elements; then add a second loop over .//div[@class='panel-body'][i]/span[j] and read the text. Hope this helps!

Tags: python-3.x selenium xpath css-selectors webdriverwait

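【Solution 1】:

I hope "select the text" means "get the text".

First for loop, over the panels (count is the number of panel-body divs):

count = len(driver.find_elements_by_xpath("//div[@class='panel-body']"))

Second for loop, over the spans of panel i, with i running from 1 to count:

driver.find_element_by_xpath("(//div[@class='panel-body'])[i]/span[j]").text

Here i and j are placeholders for the loop counters, and XPath positions start at 1. Searching with find_elements and "//div[@class='panel-body']" gives you every matching div; a second loop over (//div[@class='panel-body'])[i]/span[j] then reads the text of each span. The parentheses matter: without them, [i] counts among the siblings under one parent, and each panel here has only one panel-body. Hope this helps!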
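
The two loops described above can be sketched as follows. This is only a minimal sketch under stated assumptions, not the answerer's exact code: the function name span_texts_by_panel is hypothetical, and driver is assumed to be a WebDriver session that has already loaded the page. Instead of building indexed XPath strings, it collects the panel-body divs once and then queries the spans relative to each div, so no position index is needed. It uses the generic find_elements(by, value) call, since the find_element_by_* helpers used in the question are no longer available in recent Selenium 4 releases.

```python
def span_texts_by_panel(driver):
    """Return one list of span texts per div.panel-body, in document order."""
    # "xpath" is the string value of selenium.webdriver.common.by.By.XPATH;
    # it is spelled out here so the sketch has no import-time dependency.
    panels = driver.find_elements("xpath", "//div[@class='panel-body']")

    grouped = []
    for panel in panels:
        # "./span" is evaluated relative to this panel, so only the spans
        # that are direct children of this particular div are returned.
        spans = panel.find_elements("xpath", "./span")
        grouped.append([span.text for span in spans])
    return grouped


# Usage, assuming `driver` is an open WebDriver session on the target page:
#   for index, texts in enumerate(span_texts_by_panel(driver), start=1):
#       print(index, texts)
```

Each inner list corresponds to one block on the page, which keeps the three spans of the first panel separate from the two spans of the second.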

【Discussion】:

【Solution 2】:

To extract the text, e.g. TEXT, from each <span> using Selenium and Python, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following Locator Strategies:

Using CSS_SELECTOR and get_attribute("innerHTML"):

print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.panel.panel-success div.panel-body span.spanValorVerde")))])

Using XPATH and the text attribute:

print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='panel panel-success']//div[@class='panel-body']//span[@class='spanValorVerde']")))])

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Outro

Links to useful documentation:

The get_attribute() method gets the given attribute or property of the element.
The text attribute returns the text of the element.
Difference between text and innerHTML using Selenium
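
To make the difference between the two approaches concrete: the text attribute returns only the text as rendered, so it is empty for an element that is not visible, while get_attribute("innerHTML") returns the raw markup inside the element, including any nested tags. The helper below is a hypothetical sketch, not part of the answer; it assumes an element object that exposes the text attribute and get_attribute(), and it falls back to the textContent property when text comes back empty.

```python
def element_text(elem):
    """Return an element's visible text, or its textContent as a fallback."""
    # .text yields the rendered text; hidden elements give an empty string.
    visible = (elem.text or "").strip()
    if visible:
        return visible

    # textContent holds the text of the node and its descendants regardless
    # of visibility; get_attribute may return None if it is unavailable.
    raw = elem.get_attribute("textContent") or ""
    return raw.strip()


# Usage, assuming `spans` is a list of elements located as shown above:
#   print([element_text(span) for span in spans])
```

Whether the fallback is wanted depends on the page: if hidden spans should be ignored, reading the text attribute alone is the simpler choice.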

【Discussion】: