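Answers to: Waiting for invisible elements not on the page

【Question title】: Waiting for invisible elements not on the page
【Posted】: 2017-12-20 13:54:05
【Question description】:

I am trying to scrape this web page with the following script.

I cannot wait for this element, and it is not being scraped correctly.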

clickMe = wait(driver, 3).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ('//a[@class='style-scope match-pop-market']'))))

The element shown in Chrome's inspector is correct.

//a[@class='style-scope match-pop-market']

How can I get elem_href for the current page only, rather than for other elements that seem to belong to other pages and are not visible?

//div[@class='mpm_match_title' and .//div[@class='mpm_match_title style-scope match-pop-market']]//a[@class='style-scope match-pop-market']

In theory this should solve the problem, but it does not work. Any ideas? Current output:

None
None
None
None
None
None
None
None
None
None
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6381070
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386987
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386988
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386989
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386990
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386991
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386992
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6387025
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6387026
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6387027
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6387028

The wait cannot succeed, because it is waiting for an element that is not visible on the current page.

So:

//div[contains(@class, 'mpm_match_title')] #TEXT
//div[contains(@class, 'mpm_match_title style-scope match-pop-market')]  #BAR
//a[contains(@class, 'style-scope match-pop-market')] #HREF
style-scope match-pop-market

Combined:

//div[contains(@class, 'mpm_match_title') and .//div[contains(@class, 'mpm_match_title style-scope match-pop-market')]//a[@class='style-scope match-pop-market']

Not found.

Desired output:

https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6381070
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386987
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386988
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386989
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386990
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386991
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6386992
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6387025
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6387026
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6387027
https://www.palmerbet.com/sports/soccer/italy-serie-b/match/6387028

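【Question discussion】:

Tags: python css python-3.x selenium xpath

【Solution 1】:

Using the code from the pastebin link in the comments, I basically just modified the XPath to search for specific elements that identify the links on the current page.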

from random import shuffle

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

driver = webdriver.Chrome()
driver.set_window_size(1024, 600)
driver.maximize_window()
driver.get('https://www.palmerbet.com/sports/soccer')

# Wait for the filter labels to become clickable, then count them.
clickMe = wait(driver, 3).until(EC.element_to_be_clickable((By.XPATH,
    '//*[contains(@class,"filter_labe")]')))
options = driver.find_elements(By.XPATH, '//*[contains(@class,"filter_labe")]')

# Visit the filter options in random order.
indexes = list(range(len(options)))
shuffle(indexes)

xp = '//sport-match-grp[not(contains(@style, "display: none;"))]' \
    '//match-pop-market[@class="sport-match-grp" and ' \
    'not(contains(@style, "display: none;")) and ' \
    './/a[@id="match_link" and boolean(@href)]]'

for index in indexes:
    print(f'Loading index {index}')
    driver.get('https://www.palmerbet.com/sports/soccer')
    clickMe1 = wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH,
        '(//ul[@id="tournaments"]//li//input)[%s]' % str(index + 1))))
    driver.execute_script("arguments[0].scrollIntoView();", clickMe1)
    clickMe1.click()

    try:
        # Wait until at least one visible match block with a link is clickable.
        clickMe = wait(driver, 3).until(EC.element_to_be_clickable((
            By.XPATH, xp)))
        elems = driver.find_elements(By.XPATH, xp)

        elem_href = []
        for elem in elems:
            # The href lives on the nested anchor, not on the match block itself.
            href = elem.find_element(
                By.XPATH, './/a[@id="match_link"]').get_attribute('href')
            print(href)
            elem_href.append(href)
    except Exception:
        # Some pages list no matches, so the wait above fails.
        print(f'There are no matches in index {index}.')
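
Selenium's get_attribute returns None when an element has no such attribute, which is consistent with the None lines in the question's output. As a minimal sketch (collect_hrefs is a hypothetical helper name, not part of Selenium or of the original script), the links can be collected while skipping elements that carry no href:

```python
def collect_hrefs(elements):
    """Return the non-empty href values of the given elements, in order.

    `elements` is assumed to be a list of Selenium WebElement objects
    (anything with a get_attribute(name) method works).
    """
    hrefs = []
    for element in elements:
        # get_attribute returns None when the attribute is absent.
        href = element.get_attribute("href")
        if href:
            hrefs.append(href)
    return hrefs
```

Inside the loop above it could be called as collect_hrefs(driver.find_elements(By.XPATH, '//a[@id="match_link"]')), assuming that selector matches the anchors of interest. Filtering in Python does not replace the visibility check in xp; it only keeps None out of the result list.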

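【Discussion】:

The wait does not work when clicking through the pages. See line 28: clickMe = wait(driver, 3).until(EC.element_to_be_clickable((By.XPATH, ("//a[@class='style-scope match-pop-market']")))). This produces better output, but it cannot finish because it waits for an element that is not visible on the page.
Note that it works for a single page and for the first click. After that, nada.
Thanks for showing me the code; I was a bit confused by the original request. I have updated my reply to reflect the new answer.
Also, just a heads up: the England FA Cup and World Cup pages contain no links. I added a try / except to handle them when the wait fails.
You're a legend! :) This works really well. What would the xp selector be for the #match_title (team names) that appears on each page? That way all the data gets scraped, rather than some pages being missed.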
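
On the waiting problem raised in the comments above: WebDriverWait.until accepts any callable that takes the driver and returns a truthy value once the condition holds. A sketch of a custom condition that ignores matches which are present in the DOM but not displayed could look like this (visible_elements_located is a hypothetical name, and this is an alternative to encoding "display: none" in the XPath, not the answer's method):

```python
def visible_elements_located(locator):
    """Build a wait condition that returns only the displayed matches.

    `locator` is a (by, value) tuple such as (By.XPATH, xp).
    """
    def condition(driver):
        found = driver.find_elements(*locator)
        visible = [element for element in found if element.is_displayed()]
        # An empty result is falsy, so WebDriverWait keeps polling.
        return visible or False
    return condition
```

It would be used as elems = wait(driver, 10).until(visible_elements_located((By.XPATH, xp))), which raises TimeoutException if nothing becomes visible in time. Selenium's expected_conditions module also ships a similar built-in, visibility_of_any_elements_located, which may be enough on its own.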