【Posted】: 2021-01-03 15:55:34
【Question】:
I am trying to collect the addresses of several links shown on a website, using the following code:
from selenium import webdriver
import time
from bs4 import BeautifulSoup
import urllib.request
driver = webdriver.Chrome(executable_path='C:/Users/seongwoo/Desktop/USHL data scraping/chromedriver.exe')
url = "https://www.ushl.com/view#/schedule"
driver.get(url)
# pick season, team and month from the drop-downs, then submit the form
driver.find_element_by_xpath("//select[@ng-model ='selectedSeason']/option[@label='2018-19']").click()
time.sleep(3)
driver.find_element_by_xpath("//select[@ng-model ='selectedTeam']/option[@label='Youngstown Phantoms']").click()
time.sleep(3)
driver.find_element_by_xpath("//select[@ng-model ='selectedMonth']/option[@Value='12']").click()
time.sleep(3)
driver.find_element_by_xpath("//a[@ng-click=\"location='home';\"]").click()
time.sleep(3)
driver.find_element_by_xpath('//a[@class="ht-btn-submit ng-binding"]').click()
time.sleep(10)
window_before = driver.window_handles[0]  # store the parent window's handle
buttons = driver.find_elements_by_class_name('ht-table-game-report')  # use this instead of 'by_xpath'
for button_index in range(len(buttons)):
    time.sleep(3)
    buttons[button_index].click()  # this is where you decide which of the reports to click on
    # after clicking the link, store the window handle of the newly opened window
    window_after = driver.window_handles[1]
    # then switch to the newly opened window
    driver.switch_to.window(window_after)
    current_URL = driver.current_url  # this does not seem to update the address
    print(current_URL)
    webUrl = urllib.request.urlopen(current_URL)
    driver.switch_to.window(window_before)
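I would have expected
driver.switch_to.window(window_after)
current_URL = driver.current_url
to update the address after each link is clicked.
I would appreciate it if someone could point out why current_URL stays stuck on the first address it picks up and never updates afterwards.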
【Discussion】:
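A possible explanation, offered as an assumption rather than a confirmed diagnosis: if every click opens another window and the earlier ones are never closed, driver.window_handles[1] may keep pointing at the first report window, so driver.current_url keeps returning its address. The sketch below picks the new window by comparing the handle list before and after the click, and closes the report window before returning to the parent. The names find_new_handle and collect_report_urls are hypothetical helpers, not part of Selenium, and the code assumes each click opens exactly one new window.

```python
import time


def find_new_handle(handles_before, handles_after):
    """Return the single handle that appears only in handles_after."""
    known = set(handles_before)
    new_handles = [h for h in handles_after if h not in known]
    if len(new_handles) != 1:
        # Assumption: one click opens exactly one window.
        raise RuntimeError(
            "expected exactly one new window, found %d" % len(new_handles)
        )
    return new_handles[0]


def collect_report_urls(driver, buttons, pause=3.0):
    """Click each button, read the URL of the window it opens, close it."""
    parent = driver.current_window_handle
    urls = []
    for button in buttons:
        handles_before = list(driver.window_handles)
        button.click()
        time.sleep(pause)  # crude fixed wait, in the same spirit as the question
        new_handle = find_new_handle(handles_before, driver.window_handles)
        driver.switch_to.window(new_handle)
        urls.append(driver.current_url)
        driver.close()  # close the report window so handles do not pile up
        driver.switch_to.window(parent)
    return urls
```

With the question's variables, the loop could then be replaced by something like urls = collect_report_urls(driver, buttons), followed by printing or fetching each entry. Closing each report window also keeps the handle list short, so the comparison stays unambiguous.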
Tags: python selenium web-scraping webdriver