Python/Selenium text in td without span text: answers

【Question title】: Python/Selenium text in td without span text
【Posted】: 2020-04-29 08:46:39
【Question description】:

The HTML is:

<td>
    <i class="fas fa-arrow-down arrow-green"></i>
    <span class="fs_buy">Strong Buy</span> 
    1.11
</td>

If I use this code:

cccss ='//*[@id="fs_title_values"]/div[3]/table/tbody/tr[1]/td[5]'
about = driver.find_element_by_xpath(cccss)
RatingCurrentValue=about.text
print ('RatingCurrentValue', RatingCurrentValue)

I get all of the text: RatingCurrentValue Strong Buy 1.11

My goal is to get only 1.11, without the text inside the span tag.

Please help me.

【Discussion】:

Tags: python html selenium html-table

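【Solution 1】:

To extract the text 1.11 from the element, you can use the following XPath-based solution. The number is the last line of the cell's innerHTML, after the span:

# ".//span" keeps the span lookup inside the matched td; take the last line of its innerHTML
print(driver.find_element_by_xpath("//td[.//span[@class='fs_buy' and text()='Strong Buy']]").get_attribute("innerHTML").strip().splitlines()[-1].strip())

Ideally, you should induce WebDriverWait for visibility_of_element_located() and use the following XPath-based locator strategy:

# Same extraction, but wait up to 20 seconds for the td to become visible first
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//td[.//span[@class='fs_buy' and text()='Strong Buy']]"))).get_attribute("innerHTML").strip().splitlines()[-1].strip())

Note: you have to add the following imports: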

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
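To see what the innerHTML-based extraction does without starting a browser, here is a minimal sketch that applies the same string handling to the markup from the question. The helper name `last_line_of_inner_html` and the hard-coded `SAMPLE_INNER_HTML` string are assumptions made for illustration. A real page may lay out its whitespace differently, in which case a line-based split will not isolate the number.

```python
def last_line_of_inner_html(inner_html: str) -> str:
    """Return the last non-empty line of an innerHTML string, stripped."""
    # Drop blank lines so leading/trailing newlines in the markup do not matter.
    lines = [line.strip() for line in inner_html.splitlines() if line.strip()]
    return lines[-1] if lines else ""


# Hypothetical sample: the contents of the <td> shown in the question.
SAMPLE_INNER_HTML = (
    '\n'
    '    <i class="fas fa-arrow-down arrow-green"></i>\n'
    '    <span class="fs_buy">Strong Buy</span> \n'
    '    1.11\n'
)

print(last_line_of_inner_html(SAMPLE_INNER_HTML))
```

With Selenium, the string passed in would be the value returned by get_attribute("innerHTML") on the td element.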

【Discussion】:

【Solution 2】:

To get the value 1.11, use the JavaScript executor and read the lastChild of the td element.

Induce WebDriverWait() with visibility_of_element_located():

element=WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.XPATH,'//*[@id="fs_title_values"]/div[3]/table/tbody/tr[1]/td[5]')))
print(driver.execute_script('return arguments[0].lastChild.textContent;', element))

You need to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

Update (without the wait):

print(driver.execute_script('return arguments[0].lastChild.textContent;', driver.find_element_by_xpath('//*[@id="fs_title_values"]/div[3]/table/tbody/tr[1]/td[5]')))
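The lastChild idea can be tried offline as well. The sketch below is only a rough analogue, not what the browser does: it parses the question's td as XML with the standard library and reads the text that follows the last child element, which ElementTree exposes as that element's `tail`. The names `TD_MARKUP` and `trailing_text` are hypothetical, and the approach assumes the fragment is well-formed XML, which real HTML pages often are not.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <td> from the question, which happens to be well-formed XML.
TD_MARKUP = (
    '<td>\n'
    '    <i class="fas fa-arrow-down arrow-green"></i>\n'
    '    <span class="fs_buy">Strong Buy</span> \n'
    '    1.11\n'
    '</td>'
)


def trailing_text(td_markup: str) -> str:
    """Return the text after the last child element of the cell, stripped."""
    td = ET.fromstring(td_markup)
    children = list(td)
    if not children:
        # No child elements: the cell's own text is all there is.
        return (td.text or "").strip()
    # Text following an element is stored in that element's .tail attribute.
    return (children[-1].tail or "").strip()


print(trailing_text(TD_MARKUP))
```

The value is stripped here because the text node after the span includes the surrounding spaces and newlines; the string returned by lastChild.textContent in the browser can carry the same whitespace.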

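【Discussion】:

Thanks for the quick answer. But adding 3 libraries for this simple task is quite heavy. Is there a simpler way?
@EduardBauer: Including waits in Selenium to avoid page synchronization issues is best practice. If you don't want to use them, just try the updated version.

【Solution 3】:

You can remove the child node's text from the full text to get the parent node's own text.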

cccss ='//*[@id="fs_title_values"]/div[3]/table/tbody/tr[1]/td[5]'


full_text = driver.find_element_by_xpath(cccss).text

child_text = driver.find_element_by_xpath(cccss + "//span").text

parent_text = full_text.replace(child_text, '')
print(parent_text)
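The string step of this approach can be checked on its own. The following is a small sketch, assuming the two .text values are the ones reported in the question ("Strong Buy 1.11" for the cell and "Strong Buy" for the span). The helper name `own_text` is hypothetical. It removes only the first occurrence of the child text and strips the space that is left behind.

```python
def own_text(full_text: str, child_text: str) -> str:
    """Remove the child's text from the parent's text and trim whitespace."""
    # count=1 removes only the first occurrence, so a value that happens to
    # repeat the child text later in the cell is left intact.
    return full_text.replace(child_text, "", 1).strip()


# Assumed inputs, taken from the output shown in the question.
print(own_text("Strong Buy 1.11", "Strong Buy"))
```

Without the strip() call the result keeps the space that separated the span text from the number.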

【Discussion】:

Thanks, that's it. But it has to be parent_text = full_text.replace(child_text, '')
@EduardBauer meta.stackexchange.com/questions/5234/…