Extracting just the link in the <a> tag using Selenium in Python

【Question title】: Extracting just the link in the <a> tag using selenium in Python
【Posted】: 2021-05-07 11:17:01
【Question description】:

I am using Selenium in Python to do some web scraping, and I only want to get the link here.

<ul class="liste-sous-menu">
   <li class="target Menu" id="summary1">
       <a href="../associations/formalites-administratives-association">Formalités administratives d'une 
       association</a>
       <ul class="ul-dossier">
          <li>
              <a href="../associations/creation-association">Création</a></li>
      </ul>
  </li>
</ul>

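I am only interested in the link in the tag whose id is summary1, not the other links in the second unordered list. Because I have a long list of ids that start with summary, I wrote the code below, but when I run it I only get the text and not the link. Do you have any other suggestions?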

list_of_services = driver.find_elements_by_class_name("liste-sous-menu")
for service in list_of_services:
    # In each element, select the tags
    atags = service.find_elements_by_xpath("//li[starts-with(@id,'summary')]")
    for atag in atags:
        # In each atag, select the href
        href = atag.get_attribute('href')
        # Open a new window
        driver.execute_script("window.open('');")
        # Switch to the new window and open URL
        driver.switch_to.window(driver.window_handles[1])
        driver.get(href)
        sleep(3)

So when I try to get the link, I get this error:

selenium.common.exceptions.InvalidArgumentException: Message: invalid argument: 'url' must be a string
  (Session info: chrome=88.0.4324.104)

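【Comments】:

You are on an li element, which has no href. Go to the a tag first, then get the href.
Isn't that what I'm doing with atag.get_attribute?
That is where you currently are. Print out the href that get_attribute returns.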
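
The point made in the comments can be sketched as follows. The XPath in the question selects the li elements, which carry no href, so get_attribute('href') most likely returns None and driver.get(None) fails with "'url' must be a string". One possible fix is to extend the XPath by "/a" so that it selects the a element that is a direct child of each matching li, and to start it with "." so the search stays inside the current list. This is a minimal sketch and not a tested solution for the real page: the names collect_summary_links and SUMMARY_LINK_XPATH are made up for illustration, and it assumes a Selenium version whose find_elements(by, value) accepts the locator strategy as a plain string.

```python
# XPath relative to each <ul class="liste-sous-menu">: the leading "." keeps the
# search inside that element, and "/a" selects only the <a> that is a direct
# child of the matching <li>, skipping links in the nested <ul class="ul-dossier">.
SUMMARY_LINK_XPATH = ".//li[starts-with(@id, 'summary')]/a"


def collect_summary_links(driver):
    """Return the href of each top-level <a> inside an <li id="summary...">.

    `driver` is assumed to be an already started Selenium WebDriver
    (hypothetical helper, not part of Selenium itself).
    """
    hrefs = []
    # "class name" and "xpath" are the string values of By.CLASS_NAME and
    # By.XPATH; they are passed as strings so no extra import is needed here.
    for menu in driver.find_elements("class name", "liste-sous-menu"):
        for anchor in menu.find_elements("xpath", SUMMARY_LINK_XPATH):
            href = anchor.get_attribute("href")
            if href:  # get_attribute returns None when the attribute is absent
                hrefs.append(href)
    return hrefs
```

With the links collected up front, each one can then be visited with driver.get(href). If new windows are still wanted, switching to driver.window_handles[-1] after window.open is probably safer than index 1, which always points at the second window rather than the newest one.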