【Question title】: Python, Selenium and the XPATH of a new window
【Posted】: 2018-11-03 18:42:07
【Question】:

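How can I grab an element by XPath when the element is not displayed until you call the .click() method, and it then sits inside a JavaScript-generated part of the page (called BODY_BLOCK_JQUERY_REFLOW)?

I am trying to access this part of the HTML.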

<div class="ui_radio item" data-value="it" data-tracker="Italian">
    <input id="filters_detail_language_filterLang_it" type="radio" name="filters_detail_language_filterLang_1" value="it" onchange="widgetEvCall('handlers.updateFilter', event, this);">
 <label for="filters_detail_language_filterLang_it" class="label">Italian <span class="count">(11)</span>
 </label>
</div>

I can access languages 1 to 3, but when I go for the 4th language (and beyond) I cannot resolve the XPath, because those entries are displayed in an overlay.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import os
import time
from lxml import html


chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--window-size=1080,720")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-proxy-server")

headers = {'User-Agent': ''}
proxies = {"http": ''}
chrome_driver = os.getcwd() + "/chromedriver"
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=chrome_driver)
driver.get("https://www.tripadvisor.com/Attraction_Review-g60776-d117416-Reviews-Colorado_National_Monument-Fruita_Colorado.html")

# here we click on the more languages element
driver.find_element_by_xpath("""//*[@id="taplc_detail_filters_ar_responsive_0"]/div/div[1]/div/div[2]/div[4]/div/div[2]/div[1]/div[5]""").click()

html_thing = driver.page_source

innerHTML = driver.execute_script("return document.body.innerHTML")
parser = html.fromstring(html_thing)


# These XPATHs work since they are part of the DOM on initial load
XPATH_LANG1 = '//*[@id="taplc_detail_filters_ar_responsive_0"]/div/div[1]/div/div[2]/div[4]/div/div[2]/div[1]/div[2]/label/text()'
XPATH_LANG_COUNT1 = '//*[@id="taplc_detail_filters_ar_responsive_0"]/div/div[1]/div/div[2]/div[4]/div/div[2]/div/div[2]/label/span//text()'
XPATH_LANG2 = '//*[@id="taplc_detail_filters_ar_responsive_0"]/div/div[1]/div/div[2]/div[4]/div/div[2]/div[1]/div[3]/label/text()'
XPATH_LANG_COUNT2 = '//*[@id="taplc_detail_filters_ar_responsive_0"]/div/div[1]/div/div[2]/div[4]/div/div[2]/div/div[3]/label/span//text()'
XPATH_LANG3 = '//*[@id="taplc_detail_filters_ar_responsive_0"]/div/div[1]/div/div[2]/div[4]/div/div[2]/div[1]/div[4]/label/text()'
XPATH_LANG_COUNT3 = '//*[@id="taplc_detail_filters_ar_responsive_0"]/div/div[1]/div/div[2]/div[4]/div/div[2]/div[1]/div[4]/label/span//text()'


# Unfortunately, these XPATHs don't work. I'm assuming that is because they are inside this JQUERY block.
XPATH_LANG4 = """//*[@id="BODY_BLOCK_JQUERY_REFLOW"]/div[12]/div[2]/div/div[5]/label/text()"""

print(XPATH_LANG4, 'this is lang 4')

raw_lang1 = parser.xpath(XPATH_LANG1)
print(raw_lang1)
raw_lang_count1 = parser.xpath(XPATH_LANG_COUNT1)
print(raw_lang_count1)
raw_lang2 = parser.xpath(XPATH_LANG2)
print(raw_lang2)
raw_lang_count2 = parser.xpath(XPATH_LANG_COUNT2)
print(raw_lang_count2)
raw_lang3 = parser.xpath(XPATH_LANG3)
print(raw_lang3)
raw_lang_count3 = parser.xpath(XPATH_LANG_COUNT3)
print(raw_lang_count3)
raw_lang4 = parser.xpath(XPATH_LANG4)
if not raw_lang4:
    print(raw_lang4, '<--------------- THIS IS EMPTY')
else:
    print(raw_lang4, 'It actually showed up')

driver.close()
driver.quit()

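I have tried using `driver.find_element_by_xpath("""""")`, I have tried the parser, and everything else I could think of.

The problem seems to be that although the language (in this case "Italian", the 4th language in the overlay) is present in the page source, the XPATH cannot see it. This is a challenge because the page uses dynamic ids, or no ids at all.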
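
A positional path such as `div[12]/div[2]/div/div[5]` only matches while the overlay keeps exactly that position in the DOM, which may be part of why it comes back empty. As a minimal sketch, and not a confirmed fix for this page, the lookup can instead be anchored on the `data-value` attribute from the snippet above. The sketch uses the standard-library `xml.etree.ElementTree` rather than lxml so that it runs without extra packages. The wrapper `div` with class `hypothetical-overlay` is an assumption standing in for wherever the overlay is attached, and the `<input>` tag is trimmed and self-closed so the fragment parses as XML.

```python
import xml.etree.ElementTree as ET

# Markup adapted from the snippet in the question. The "hypothetical-overlay"
# wrapper is an assumption; the real nesting on the page is unknown.
SNIPPET = """
<div id="BODY_BLOCK_JQUERY_REFLOW">
  <div class="hypothetical-overlay">
    <div class="ui_radio item" data-value="it" data-tracker="Italian">
      <input id="filters_detail_language_filterLang_it" type="radio" value="it"/>
      <label for="filters_detail_language_filterLang_it" class="label">Italian <span class="count">(11)</span>
      </label>
    </div>
  </div>
</div>
"""

root = ET.fromstring(SNIPPET)

# Match on the data-value attribute instead of counting div positions,
# so the nesting depth of the overlay does not matter.
label = root.find(".//div[@data-value='it']/label")

language = label.text.strip()                                  # text before the <span>
count = label.find("span[@class='count']").text.strip("()")   # digits inside "(11)"

print(language, count)
```

With lxml the corresponding expression would be along the lines of `parser.xpath('//div[@data-value="it"]/label')`. Either way, the overlay markup has to be present in the source that is parsed, so `page_source` would need to be read only after the overlay has rendered.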

【Comments on the question】:

    Tags: javascript python selenium xpath selenium-chromedriver


    【Solution 1】:

    Try clicking the "More languages" button and waiting for the required options to appear in the modal window:

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait as wait
    
    driver.get('https://www.tripadvisor.com/Attraction_Review-g60776-d117416-Reviews-Colorado_National_Monument-Fruita_Colorado.html')
    wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//span[.="More languages"]'))).click()
    languages = [lang.text for lang in wait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="more-options"]//label')))][1:]
    
    print(languages)
    
    ['English (1,463)', 'German (34)', 'French (24)', 'Italian (11)', 'Dutch (9)', 'Chinese (Sim.) (8)', 'Danish (5)', 'Portuguese (4)', 'Spanish (4)', 'Japanese (2)', 'Russian (2)', 'Polish (1)']
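
The printed labels combine the language name and the review count in one string. If the two are needed separately, one possible way to split them is sketched below. The helper name `parse_label` and the regular expression are assumptions for illustration and are not part of the answer above; the input list is copied from the output shown.

```python
import re

# Output copied from the answer above.
languages = ['English (1,463)', 'German (34)', 'French (24)', 'Italian (11)',
             'Dutch (9)', 'Chinese (Sim.) (8)', 'Danish (5)', 'Portuguese (4)',
             'Spanish (4)', 'Japanese (2)', 'Russian (2)', 'Polish (1)']

# Name is everything before the final "(digits)" group; the count may
# contain thousands separators such as "1,463".
LABEL_RE = re.compile(r"^(?P<name>.+?)\s*\((?P<count>[\d,]+)\)$")


def parse_label(label):
    """Split a label like 'Italian (11)' into ('Italian', 11)."""
    match = LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError("unexpected label format: %r" % label)
    return match.group("name"), int(match.group("count").replace(",", ""))


counts = dict(parse_label(label) for label in languages)
print(counts["Italian"])
```

Because the name part is matched lazily up to the last parenthesised number, a label such as 'Chinese (Sim.) (8)' keeps '(Sim.)' as part of the name.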
    

    【Comments】:

    • You have no idea how long I spent on this.
    • Looking at your answer, @Andersson, it is very concise and I like your approach. I can learn a lot from it.