Python & Selenium: How can I obtain a href element from a specific div? My element is stale

【Question title】: Python & Selenium: How can I obtain a href element from a specific div? My element is stale
【Posted】: 2020-03-19 23:18:46
【Question】:

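I am new to web scraping with Selenium and Python. The web page I am trying to scrape has href elements inside a specific div that I want to access. I tried find_element_by_xpath() to get one, but it reports that the element cannot be found. I then tried locating the div by its class and reading the href from it, but that reports that my element is stale. I have trouble understanding why it is stale, since the second approach seems to work for people in tutorials and on Stack Overflow.

The basic HTML looks like this: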

    <div class=div1>
        <ul>
            <li>
                <a href='path/to/div1stuff/1'>Generic string 1</a>
                <a href='path/to/div1stuff/2'>Generic string 2</a>
                <a href='path/to/div1stuff/3'>Generic string 3</a>
            </li>
        </ul>            
    </div>

    <div class=div2>
        <ul>
            <li>
                <a href='path/to/div2stuff/1'>Generic string 1</a>
                <a href='path/to/div2stuff/2'>Generic string 2</a>
                <a href='path/to/div2stuff/3'>Generic string 3</a>
            </li>
        </ul>            
    </div>

And my Python code:

from selenium import webdriver


class Scraper(object):
    def __init__(self):
        pass

    def execute(self):
        """ Run class methods """

        self.home = "https://www.website2scrape.com/"

        self.get_stuff()


    def get_stuff(self):
        """ Get stuff """

        driver = webdriver.Firefox("/usr/local/bin/")
        driver.get(self.home)

        # Example 1 
        driver.find_element_by_xpath("//div[@class='div2']//a[contains(@href,'Generic string 2')]").click()

        # Example 2
        elements = driver.find_elements_by_css_selector("div.div2")
        for element in elements:
            print(element.get_attribute("href"))

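Example 1 fails with an element-not-found error.

Example 2 fails with a stale-element error.

I am trying to click the "Generic string 2" link inside div2, but if I just look the link up with: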

driver.find_element_by_xpath('//a[contains(@href, "Generic string 2")]')

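it clicks the link from div1. How do I get the href from a specific div class?

【Comments】:

Always put the full error message (starting at the word "Traceback") in the question (not in a comment) as text (not a screenshot). It may contain other useful information.
In the second example you search for href in the div rather than in the a. You should try "div.div2 a".
In the first example you have to use text(), not @href.

Tags: python selenium web-scraping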

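【Solution 1】:

In the first example you have to use text() instead of @href, because "Generic string 2" is the link's visible text and does not appear in its href value: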

driver.find_element_by_xpath("//div[@class='div2']//a[contains(text(),'Generic string 2')]").click()

In the second example you look for href on the div, but the attribute is on the a elements, so you have to add a to the selector:

elements = driver.find_elements_by_css_selector("div.div2 a")
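
The difference between the two fixes above can be checked without a browser. The sketch below does not use Selenium: it parses a tidied copy of the question's markup with the standard library's `xml.etree.ElementTree` (attribute values quoted and a `<body>` root added so it parses as XML, both assumptions made for this example). It shows that the div itself has no href, and that "Generic string 2" appears in the link text but not in any href value. `PAGE` and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Tidied copy of the question's HTML so that it is well-formed XML.
PAGE = """
<body>
  <div class="div1"><ul><li>
    <a href="path/to/div1stuff/1">Generic string 1</a>
    <a href="path/to/div1stuff/2">Generic string 2</a>
  </li></ul></div>
  <div class="div2"><ul><li>
    <a href="path/to/div2stuff/1">Generic string 1</a>
    <a href="path/to/div2stuff/2">Generic string 2</a>
  </li></ul></div>
</body>
"""

root = ET.fromstring(PAGE)

# The div carries no href attribute, so asking it for one yields None.
div2 = root.find(".//div[@class='div2']")
div2_href = div2.get("href")

# The href values live on the <a> descendants of the div.
links = div2.findall(".//a")
hrefs = [a.get("href") for a in links]

# Matching on the visible text finds the link ...
by_text = [a.get("href") for a in links if "Generic string 2" in (a.text or "")]
# ... while matching the same string against href finds nothing.
by_href = [a.get("href") for a in links if "Generic string 2" in a.get("href", "")]

print(div2_href)  # None
print(hrefs)      # ['path/to/div2stuff/1', 'path/to/div2stuff/2']
print(by_text)    # ['path/to/div2stuff/2']
print(by_href)    # []
```
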

Minimal working code:

import selenium.webdriver

driver = selenium.webdriver.Firefox()

html_content = """
    <div class=div1>
        <ul>
            <li>
                <a href='path/to/div1stuff/1'>Generic string 1</a>
                <a href='path/to/div1stuff/2'>Generic string 2</a>
                <a href='path/to/div1stuff/3'>Generic string 3</a>
            </li>
        </ul>            
    </div>

    <div class=div2>
        <ul>
            <li>
                <a href='path/to/div2stuff/1'>Generic string 1</a>
                <a href='path/to/div2stuff/2'>Generic string 2</a>
                <a href='path/to/div2stuff/3'>Generic string 3</a>
            </li>
        </ul>            
    </div>
"""

driver.get("data:text/html;charset=utf-8," + html_content)

elements = driver.find_elements_by_css_selector("div.div2 a")
for x in elements:
    print(x.get_attribute('href'))

item = driver.find_element_by_xpath("//div[@class='div2']//a[contains(text(),'Generic string 2')]")
print(item.get_attribute('href'))
item.click()
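
The `find_element_by_*` helpers used in this thread come from Selenium 3. They were deprecated in Selenium 4 and removed in later 4.x releases, where the same lookups go through `find_element(by, value)` and `find_elements(by, value)`. A minimal sketch of that form follows. The function names `div2_hrefs` and `click_div2_link` are hypothetical, and the locator strategies are written as plain strings, which are the values behind `By.CSS_SELECTOR` and `By.XPATH`, so the block does not itself import Selenium.

```python
CSS = "css selector"  # value of selenium.webdriver.common.by.By.CSS_SELECTOR
XPATH = "xpath"       # value of selenium.webdriver.common.by.By.XPATH


def div2_hrefs(driver):
    """Return the href of every <a> element inside div.div2."""
    links = driver.find_elements(CSS, "div.div2 a")
    return [link.get_attribute("href") for link in links]


def click_div2_link(driver, text):
    """Click the link inside div.div2 whose text contains `text`.

    Assumes `text` contains no single quote, since it is placed
    inside a single-quoted XPath string literal.
    """
    xpath = f"//div[@class='div2']//a[contains(text(), '{text}')]"
    driver.find_element(XPATH, xpath).click()
    return xpath
```

With a real driver these would be called after `driver.get(...)`, for example `click_div2_link(driver, "Generic string 2")`.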

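【Comments】:

Thanks for the help! For the second example I still get the error: selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: //div[@class='div2']//a[contains(text(),'Generic string 2')]
Maybe the real HTML looks a bit different. You should always add the URL to the question so we can see the actual HTML.
And you should always put the full error message (starting at the word "Traceback") in the question. It may contain other useful information.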
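
Besides the real HTML differing from the simplified sample, another common cause of `NoSuchElementException` and of stale-element errors is content that JavaScript renders or re-renders after `driver.get()` returns. Whether that applies to this page is not known from the thread. Selenium ships `WebDriverWait` for that situation; the sketch below is a hand-rolled stand-in that only illustrates the idea of repeating a fresh lookup until it succeeds or a timeout passes. `retry_lookup` and its parameters are hypothetical names, not part of Selenium.

```python
import time


def retry_lookup(lookup, timeout=10.0, poll=0.5, ignored=(Exception,)):
    """Call `lookup()` repeatedly until it returns without raising.

    lookup  -- zero-argument callable that performs a fresh element search
    timeout -- seconds to keep trying before re-raising the last error
    poll    -- seconds to sleep between attempts
    ignored -- exception types that mean "not ready yet"
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return lookup()
        except ignored:
            # Give up once the deadline has passed, otherwise try again.
            if time.monotonic() >= deadline:
                raise
            time.sleep(poll)
```

With Selenium, `lookup` would be something like `lambda: driver.find_element_by_xpath(...)`, and `ignored` would be `(NoSuchElementException, StaleElementReferenceException)` from `selenium.common.exceptions`. Because each attempt searches again, the returned element is never a reference kept from before the page changed.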

【Solution 2】:

The XPath below clicks the second link under the div2 element.

Option 1:

driver.find_element_by_xpath("//div[@class='div2']//ul//li//a[2]").click()

If you want to click based on the link text, you can use the code below.

Option 2:

driver.find_element_by_xpath("//div[@class='div2']//a[contains(text(),'Generic string 2')]").click()

To click based on the href attribute:

Option 3:

driver.find_element_by_xpath("//div[@class='div2']//ul/li//a[contains(@href,'path/to/div2stuff/2')]").click()

【Comments】: