'list' object has no attribute 'get_attribute' while iterating through WebElements

【Question title】: 'list' object has no attribute 'get_attribute' while iterating through WebElements
【Posted】: 2018-05-23 22:33:40
【Question】:

I am trying to use Python and Selenium to scrape multiple links from a web page. I am using find_elements_by_xpath and I can locate a list of elements, but I cannot turn the returned list into the actual href links. I know find_element_by_xpath works, but that only handles a single element.

Here is my code:

from selenium import webdriver

path_to_chromedriver = 'path to chromedriver location'
browser = webdriver.Chrome(executable_path = path_to_chromedriver)

browser.get("file:///path to html file")

all_trails = []

#finds all elements with the class 'text-truncate trail-name' then 
#retrieve the a element
#this seems to be just giving us the element location but not the 
#actual location

find_href = browser.find_elements_by_xpath('//div[@class="text truncate trail-name"]/a[1]')
all_trails.append(find_href)

print all_trails

This code returns:

<selenium.webdriver.remote.webelement.WebElement 
(session="dd178d79c66b747696c5d3750ea8cb17", 
element="0.5700549730549636-1663")>, 
<selenium.webdriver.remote.webelement.WebElement 
(session="dd178d79c66b747696c5d3750ea8cb17", 
element="0.5700549730549636-1664")>,

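I would like the all_trails list to be a list of links, for example: www.google.com, www.yahoo.com, www.bing.com.

I tried looping over the all_trails list and calling the get_attribute('href') method on it, but I get the error: 'list' object has no attribute 'get_attribute'

Does anyone know how to convert a Selenium WebElement into its href link?

Any help would be greatly appreciated :)

【Comments】:

Note that find_elements_by_xpath is plural; it returns a list. When you append that result to another list, you get a list of lists (not a flat list of elements).
Please paste your HTML here.

Tags: python selenium selenium-webdriver web-scraping selenium-chromedriver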

【Solution 1】:

find_href = browser.find_elements_by_xpath('//div[@class="text truncate trail-name"]/a[1]')
for i in find_href:
    all_trails.append(i.get_attribute('href'))

get_attribute works on the elements of that list, not on the list itself.
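The difference can be reproduced without a browser. The sketch below uses a hypothetical FakeElement class as a stand-in for Selenium's WebElement (it is not part of Selenium, and the example.com URLs are placeholders) to show why appending the whole result fails while iterating over it works.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, href):
        self._href = href

    def get_attribute(self, name):
        # Only 'href' is modelled here; any other name yields None.
        return self._href if name == "href" else None


# Pretend this is what find_elements_by_xpath(...) returned: a plain list.
found = [FakeElement("http://example.com/a"), FakeElement("http://example.com/b")]

# The question's approach: append() stores the whole list as ONE item.
nested = []
nested.append(found)

error_message = None
try:
    for item in nested:
        item.get_attribute("href")  # item is the inner list, not an element
except AttributeError as exc:
    error_message = str(exc)

# The fix: call get_attribute on each element of the returned list.
all_trails = []
for element in found:
    all_trails.append(element.get_attribute("href"))

print(error_message)
print(all_trails)
```

If a flat list of elements is wanted rather than a list of links, all_trails.extend(found) would add the elements one by one instead of nesting the list.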

【Comments】:

【Solution 2】:

If you have the following HTML:

<div class="text-truncate trail-name">
<a href="http://google.com">Link 1</a>
</div>
<div class="text-truncate trail-name">
<a href="http://google.com">Link 2</a>
</div>
<div class="text-truncate trail-name">
<a href="http://google.com">Link 3</a>
</div>
<div class="text-truncate trail-name">
<a href="http://google.com">Link 4</a>
</div>

Your code should look like this:

all_trails = []

all_links = browser.find_elements_by_css_selector(".text-truncate.trail-name>a")

for link in all_links:
    all_trails.append(link.get_attribute("href"))

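Here all_trails is the list of links (Link 1, Link 2, and so on).

Hope this helps!

【Comments】:

For some reason this did not actually return anything. DebanjanB's solution above worked. Thanks for the help, you definitely pointed me in the right direction.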
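One possible explanation for the empty result, offered only as an assumption because the real HTML was never posted: the XPath in the question compares the whole class attribute against "text truncate trail-name", while the CSS selector in this answer requires a class token named text-truncate. The sketch below imitates both matching rules in plain Python; the two helper functions are hypothetical and are not Selenium APIs.

```python
def css_classes_match(class_attr, required_classes):
    """Mimic a CSS selector such as .a.b: each name must be a whitespace-separated token."""
    tokens = set(class_attr.split())
    return all(cls in tokens for cls in required_classes)


def xpath_class_equals(class_attr, literal):
    """Mimic the XPath test @class="...": the whole attribute string must be equal."""
    return class_attr == literal


# Assumption: the page uses the class string from the question's XPath.
page_class = "text truncate trail-name"

css_hit = css_classes_match(page_class, ["text-truncate", "trail-name"])
xpath_hit = xpath_class_equals(page_class, "text truncate trail-name")

print(css_hit)    # False: "text-truncate" is not one of the three tokens
print(xpath_hit)  # True: the attribute string matches exactly
```

Under that assumption, a selector such as .text.truncate.trail-name>a would be the CSS counterpart of the question's XPath.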

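【Solution 3】:

Let's see what is happening in your code:

Without any visibility into the relevant HTML, it seems the following line returns two WebElements in the List find_href, which is in turn appended to the all_trails List: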

find_href = browser.find_elements_by_xpath('//div[@class="text truncate trail-name"]/a[1]')

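So when we print the List all_trails, both WebElements are printed. Hence there is no error.

As per the error snapshot you provided, you are trying to invoke the get_attribute("href") method on a List, which is not supported. Hence you see the error:

'list' object has no attribute 'get_attribute'

Solution:

To get the href attribute, we have to iterate over the List, as follows: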

find_href = browser.find_elements_by_xpath('//your_xpath')
for my_href in find_href:
    print(my_href.get_attribute("href"))
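On newer Selenium releases the find_elements_by_* helpers are no longer available, and the same loop is written with find_elements(By.XPATH, ...). The sketch below wraps that call in a hypothetical helper, collect_hrefs, and skips elements whose href is missing. StubElement and StubDriver are made-up stand-ins so the example runs without a browser; with a real driver you would pass browser instead.

```python
def collect_hrefs(driver, xpath):
    """Return the href of every element matching `xpath`.

    Elements without an href attribute are skipped, because
    get_attribute returns None for a missing attribute.
    """
    # "xpath" is the value of By.XPATH (selenium.webdriver.common.by).
    elements = driver.find_elements("xpath", xpath)
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href is not None:
            hrefs.append(href)
    return hrefs


# --- Hypothetical stand-ins so the sketch runs without a browser ---
class StubElement:
    def __init__(self, attributes):
        self._attributes = attributes

    def get_attribute(self, name):
        return self._attributes.get(name)


class StubDriver:
    def __init__(self, elements):
        self._elements = elements

    def find_elements(self, by, value):
        # A real driver would query the page; this stub returns fixed data.
        return list(self._elements)


stub = StubDriver([
    StubElement({"href": "http://example.com/one"}),
    StubElement({"id": "no-link"}),
    StubElement({"href": "http://example.com/two"}),
])
links = collect_hrefs(stub, "//your_xpath")
print(links)
```

Returning a new list from a function, rather than appending to a global all_trails, keeps the helper reusable for other XPath expressions.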

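【Comments】:

【Solution 4】:

Use the singular form, find_element_by_css_selector, instead of find_elements_by_css_selector, because the plural form returns many WebElements in a list. With the plural form you need to iterate over each WebElement before you can read its attribute. Note that the singular form returns only the first matching element, so it yields one link rather than all of them.

【Comments】: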