【Question title】: Extract contents after click() with Python Selenium
【Posted】: 2023-03-06 22:33:01
【Question description】:

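I want to extract the biography of one person, Herbert W. Gullquist. The biography ("Gullquist is chief investment officer and a general partner with Lazard Asset Management...") appears after clicking his name in the "Manager Timeline" on this page: https://www.morningstar.com/funds/xnas/lziex/people

The code cannot find that person. Is it because the code clicks in the wrong place?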

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions() 
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options)
driver.get("https://www.morningstar.com/funds/xnas/lziex/people")

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='sal-component-ctn sal-component-manager-timeline-chart']//text[text()='Gullquist']/.."))).click()
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.sal-modal-biography.ng-binding.ng-scope"))).text.strip())

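Also, what if I want the biography of every person in the Manager Timeline (9 people in total) rather than just one? Any help is much appreciated.

【Comments】:

    Tags: python selenium svg modal-dialog webdriverwait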


    【Solution 1】:

    The XPath value in this line is wrong:

    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='sal-component-ctn sal-component-manager-timeline-chart']//text[text()='Gullquist']/.."))).click()
    

    Change it to the following:

    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//*[text()='Herbert W. Gullquist']"))).click()
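
A likely reason the original locator fails is twofold: a bare element name such as `text` does not match elements in the SVG namespace, and `text()='Gullquist'` is an exact comparison, not a substring match. The sketch below illustrates both rules with the standard library's ElementTree rather than a browser. The markup is a hypothetical, heavily simplified stand-in for the timeline, and ElementTree implements only a subset of XPath, so this shows the principle and not Selenium's behaviour on the live page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the timeline markup (not the real page).
PAGE = """
<div class="timeline">
  <svg xmlns="http://www.w3.org/2000/svg">
    <g class="past-manager-timeline">
      <text>Herbert W. Gullquist</text>
    </g>
  </svg>
</div>
"""

root = ET.fromstring(PAGE)

# A bare element name only matches elements without a namespace,
# so the SVG <text> node is missed.
by_bare_name = root.findall(".//text")

# A wildcard step matches elements in any namespace,
# but the text comparison must match the whole string.
by_partial_text = root.findall(".//*[.='Gullquist']")
by_full_text = root.findall(".//*[.='Herbert W. Gullquist']")

print(len(by_bare_name), len(by_partial_text), len(by_full_text))
```

Only the last query finds the element, which is consistent with the corrected XPath above using `*` together with the full name.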
    

    【Comments】:

      【Solution 2】:

      There are 6 current managers and 3 past managers. Get both sets of links:

      current = driver.find_elements(By.XPATH, "//*[@class='current-manager-timeline']//*[name()='g']//*[name()='text']")
      past = driver.find_elements(By.XPATH, "//*[@class='past-manager-timeline']//*[name()='g']//*[name()='text']")
      total_links = current + past

      Once you have the links, you can loop over them, click each one, and extract the bio:

      bio = driver.find_element(By.XPATH, "//div[contains(@class, 'biography')]").text
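
One way the loop could look is sketched below. It is untested against the live page and rests on several assumptions: the locators are adapted from the answers in this thread, the modal has to be closed before the next name can be clicked, and each name is looked up again by its text on every pass in case the earlier element references have gone stale. `xpath_literal` and `scrape_bios` are hypothetical helper names, not part of Selenium.

```python
from typing import Dict

# Locators adapted from the answers in this thread; treat them as assumptions
# about the page's markup, which may have changed.
TIMELINE_LABELS = (
    "//*[@class='current-manager-timeline' or @class='past-manager-timeline']"
    "//*[name()='text']"
)
BIO_CSS = "div.sal-modal-biography"
CLOSE_BUTTON = (
    "//div[@class='sal-manager-modal__modalHeader']"
    "//button[contains(@class, 'sal-icons--close')]"
)


def xpath_literal(value: str) -> str:
    """Quote a string for use inside an XPath 1.0 expression."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    # Both quote types are present: join the pieces with concat().
    parts = value.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def scrape_bios(driver, timeout: float = 20) -> Dict[str, str]:
    """Hypothetical helper: click each manager name and collect the bio text."""
    # Imported here so xpath_literal() stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)

    # Record the names first; clicking by name later avoids reusing old elements.
    wait.until(EC.presence_of_element_located((By.XPATH, TIMELINE_LABELS)))
    labels = driver.find_elements(By.XPATH, TIMELINE_LABELS)
    names = [n for n in dict.fromkeys(el.text.strip() for el in labels) if n]

    bios: Dict[str, str] = {}
    for name in names:
        link = (By.XPATH, f"//*[name()='svg']//*[text()={xpath_literal(name)}]")
        wait.until(EC.element_to_be_clickable(link)).click()

        bio = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, BIO_CSS)))
        bios[name] = bio.text.strip()

        # Close the modal and wait for it to disappear before the next click.
        wait.until(EC.element_to_be_clickable((By.XPATH, CLOSE_BUTTON))).click()
        wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR, BIO_CSS)))
    return bios
```

With a driver that has already loaded the people page, `scrape_bios(driver)` would be expected to return a dict mapping each manager's name to the bio text, provided the locators still match the page.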

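      【Comments】:

      • Thank you very much. I can use the first block of code to get the links. How do I loop through the links?
      【Solution 3】:

      To extract the biography of "Herbert W. Gullquist" from the page https://www.morningstar.com/funds/xnas/lziex/people and then close the pop-up, note that the element sits inside an <svg> tag, so you need to induce WebDriverWait for element_to_be_clickable(). You can use the following locator strategy: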

      • Code block:

        from selenium import webdriver
        from selenium.webdriver.chrome.service import Service
        from selenium.webdriver.common.by import By
        from selenium.webdriver.support.ui import WebDriverWait
        from selenium.webdriver.support import expected_conditions as EC

        options = webdriver.ChromeOptions()
        options.add_argument("start-maximized")
        options.add_experimental_option("excludeSwitches", ["enable-automation"])
        options.add_experimental_option('useAutomationExtension', False)
        # executable_path is deprecated in Selenium 4; pass the driver path through Service instead.
        driver = webdriver.Chrome(service=Service(r'C:\Utility\BrowserDrivers\chromedriver.exe'), options=options)
        driver.get("https://www.morningstar.com/funds/xnas/lziex/people")
        # Click the manager's name inside the SVG timeline.
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='sal-component-ctn sal-component-manager-timeline-chart']//*[name()='svg']//*[name()='g' and @class='past-manager-timeline']//*[text()='Herbert W. Gullquist']"))).click()
        print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.sal-modal-biography.ng-binding.ng-scope"))).text.strip())
        # Close the pop-up once the biography has been printed.
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='sal-component-ctn sal-modal-scrollable']//div[@class='sal-manager-modal__modalHeader']//button[@class='sal-icons sal-icons--close mds-button mds-button--icon-only']"))).click()
        
      • Console output:

        Gullquist is chief investment officer and a general partner with Lazard Asset Management, his employer since 1982. Previously, he spent 12 years as general partner, managing director, and chief investment officer of Oppenheimer & Company. Prior to that, from 1970 to 1971, he served as the director of Stuyvesant Asset Management, a company he founded. He has also worked at First National Bank of Chicago.
        

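      You can find a relevant discussion in "How to click on SVG elements using XPath and Selenium WebDriver through Java".

      【Comments】:

      • Thank you very much! Could you help me close the pop-up after extracting the bio? I tried WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "svg.mds-icon.mds-button__icon.mds-button__icon--left"))).click() but it does not click the close button.
      • @AlanZ Added a line of code that closes the pop-up after extracting the bio. Check the updated answer and let me know the status.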