Unable to extract URL names from table using Selenium webdriver

【Question title】: Unable to extract URL names from table using Selenium webdriver
【Posted】: 2021-06-11 06:01:05
【Question description】:

I have a table like the one below:

The goal is to extract the names using Selenium webdriver.

I tried to get the names by XPath with the following code:

wd = webdriver.Chrome('chromedriver',chrome_options=chrome_options)
wd.get("https://www.deakin.edu.au/information-technology/staff-listing")

names = wd.find_element_by_xpath('//*[@id="table09355"]/tbody/tr[1]/td/a').text

The output is empty, i.e. ''. How can I extract the names with XPath in Selenium webdriver? The names are URL hyperlinks.

Thanks,

【Question comments】:

Tags: python-3.x selenium selenium-webdriver xpath

【Solution 1】:

You may want to use the XPath below:

//a[contains(@href,'https://')]

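Then use find_elements to collect every matching anchor tag into a list and loop over it, like this:

# find_elements returns a list of WebElements (empty if nothing matches)
for names in wd.find_elements(By.XPATH, "//a[contains(@href,'https://')]"):
    print(names.text)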
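
If the loop above still prints empty strings, one possible cause (an assumption, not something confirmed in the question) is that the links are inside a collapsed section: `.text` only returns text that is currently rendered, while the `textContent` DOM property is also populated for hidden nodes. The helper below is a minimal sketch of that fallback; `link_names` is a hypothetical name introduced for illustration, and it only relies on the `text` attribute and `get_attribute` method that Selenium's WebElement provides.

```python
def link_names(elements):
    """Return the non-empty link texts for a list of anchor WebElements."""
    names = []
    for element in elements:
        # .text only returns text that is currently rendered on the page.
        text = element.text.strip()
        if not text:
            # textContent is read from the DOM, so hidden nodes have it too.
            text = (element.get_attribute("textContent") or "").strip()
        if text:
            names.append(text)
    return names


# Possible usage with the locator from the answer (needs a live driver `wd`):
# names = link_names(wd.find_elements(By.XPATH, "//a[contains(@href,'https://')]"))
```

Expanding the section first, as the answer's Update 1 does, remains the more faithful way to read what a user would actually see; the fallback is only a convenience when visibility does not matter.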

Update 1:

driver.maximize_window()
wait = WebDriverWait(driver, 10)
driver.get('https://www.deakin.edu.au/information-technology/staff-listing')
wait.until(EC.element_to_be_clickable((By.ID, "popup-accept"))).click()
ActionChains(driver).move_to_element(wait.until(EC.element_to_be_clickable((By.XPATH, "//span[text()='Emeritus Professors']")))).perform()
wait.until(EC.element_to_be_clickable((By.XPATH, "//span[text()='Emeritus Professors']"))).click()
ActionChains(driver).move_to_element(wait.until(EC.visibility_of_element_located((By.XPATH, "//span[contains(text(), 'Emeritus Professors')]/ancestor::h3/following-sibling::div/descendant::a")))).perform()
for names in driver.find_elements(By.XPATH, "//span[contains(text(), 'Emeritus Professors')]/ancestor::h3/following-sibling::div/descendant::a"):
    print(names.text)

Output:

Emeritus Professor Lynn Batten
Emeritus Professor Andrzej Goscinski

Process finished with exit code 0

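Imports:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains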

If you want to run this on Google Colab, try the following code:

!pip install selenium
!apt-get update 
!apt install chromium-chromedriver

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
# Start a single headless Chrome session (the original started two, `wd` and `driver`)
driver = webdriver.Chrome(options=chrome_options)
wait = WebDriverWait(driver, 10)
driver.get("https://www.deakin.edu.au/information-technology/staff-listing")
wait.until(EC.element_to_be_clickable((By.ID, "popup-accept"))).click()
ActionChains(driver).move_to_element(wait.until(EC.element_to_be_clickable((By.XPATH, "//span[text()='Emeritus Professors']")))).perform()
wait.until(EC.element_to_be_clickable((By.XPATH, "//span[text()='Emeritus Professors']"))).click()
ActionChains(driver).move_to_element(wait.until(EC.visibility_of_element_located((By.XPATH, "//span[contains(text(), 'Emeritus Professors')]/ancestor::h3/following-sibling::div/descendant::a")))).perform()
for names in driver.find_elements(By.XPATH, "//span[contains(text(), 'Emeritus Professors')]/ancestor::h3/following-sibling::div/descendant::a"):
    print(names.text)

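【Comments】:

Thanks for your answer. Could you show an example using the code above? I still can't get it to work, as I'm new to this.
@user3046211: I've updated the code under the Update 1 section.
Got it, but I get a timeout exception when I run your code. Have you run into this exception?
Ah! Really? Let me try again on Colab.
I'd suggest adding exception handling.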
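
The exception handling suggested in the last comment could be sketched as follows. The thread does not say which wait timed out; a plausible guess (an assumption) is the `popup-accept` banner, which may not appear in every session. `wait_or_none` is a hypothetical helper name; `TimeoutException` is the exception that `WebDriverWait.until` raises when the condition is not met in time.

```python
try:
    from selenium.common.exceptions import TimeoutException
except ImportError:
    # Stand-in so the sketch can be loaded without Selenium installed.
    class TimeoutException(Exception):
        pass


def wait_or_none(wait, condition):
    """Return wait.until(condition), or None if the wait times out."""
    try:
        return wait.until(condition)
    except TimeoutException:
        return None


# Possible usage with the names from the answer (needs a live `wait`):
# banner = wait_or_none(wait, EC.element_to_be_clickable((By.ID, "popup-accept")))
# if banner is not None:
#     banner.click()
```

Treating only the optional step this way keeps a genuine failure, such as the staff section never loading, visible as an error instead of silently skipping it.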