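【Question title】:WebScrape - Getting the href
【Posted】:2022-01-15 09:28:15
【Question description】:
At the end of each row on this page there is a "View Posters" link that contains a URL.
The first link in each row, which my code extracts as "ur", is pulled fine.
I can't figure out how to pull the View Posters URL.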
from selenium import webdriver
import time
import pandas as pd
import requests
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
val = []
absinfo = []
sesinfo = []
url = 'https://meetings.asco.org/meetings/2022-gastrointestinal-cancers-symposium/286/program-guide/search?q=&filters=%7B%22sessionType%22:%5B%7B%22key%22:%22Poster%20Session%22%7D%5D%7D'
res = requests.get(url)
soup = BeautifulSoup(res.content, 'html.parser')
driver.get(url)
time.sleep(4)  # give the page time to render the session cards
productlist = driver.find_elements_by_xpath(".//div[@class='session-card']")
# times = soup.select('.time')
for b in productlist:
    # first <a> inside each card; this one is extracted correctly
    ur = b.find_element_by_css_selector('a').get_attribute('href')
Tags:
python
selenium
web-scraping
beautifulsoup
css-selectors
【Solution 1】:
If you want to use Selenium, try the following XPath expressions to locate the href links inside each element of productlist.
driver.get("https://meetings.asco.org/meetings/2022-gastrointestinal-cancers-symposium/286/program-guide/search?q=&filters=%7B%22sessionType%22:%5B%7B%22key%22:%22Poster%20Session%22%7D%5D%7D")
productlist = driver.find_elements_by_xpath(".//div[@class='session-card']")
for item in productlist:
    # link inside the session title span
    print("Url 1 :" + item.find_element_by_xpath(".//span[@data-cy='sessionTitle']//a").get_attribute('href'))
    # the <a> that contains a span whose text is exactly 'View Posters'
    print("View Poster :" + item.find_element_by_xpath(".//a[.//span[text()='View Posters']]").get_attribute('href'))
Output:
Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14170
View Poster :https://meetings.asco.org/session/14170
Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14145
View Poster :https://meetings.asco.org/session/14145
Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14169?presentation=205955
View Poster :https://meetings.asco.org/session/14169
Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14168
View Poster :https://meetings.asco.org/session/14168
Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14450
View Poster :https://meetings.asco.org/session/14450
Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14163
View Poster :https://meetings.asco.org/session/14163
Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14449
View Poster :https://meetings.asco.org/session/14449
Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14451
View Poster :https://meetings.asco.org/session/14451
Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14166
View Poster :https://meetings.asco.org/session/14166
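
The `find_element_by_*` helpers used above were deprecated in Selenium 4 and removed in later 4.x releases, so on a current install the same XPath lookups can be written with `find_element("xpath", ...)`. The following is a minimal sketch rather than the answerer's code: `card_links` and `scrape` are hypothetical helper names, the 15-second timeout is an arbitrary choice, and it assumes a card may lack a View Posters link, in which case `None` is returned instead of an exception being raised.

```python
XPATH = "xpath"  # same string value as selenium.webdriver.common.by.By.XPATH


def card_links(card):
    """Return (title_url, poster_url) for one session-card element."""
    title = card.find_element(XPATH, ".//span[@data-cy='sessionTitle']//a")
    # find_elements returns an empty list when nothing matches,
    # so a card without a poster link does not raise.
    posters = card.find_elements(XPATH, ".//a[.//span[text()='View Posters']]")
    poster_url = posters[0].get_attribute("href") if posters else None
    return title.get_attribute("href"), poster_url


def scrape(driver, url, timeout=15):
    """Load url and return a list of (title_url, poster_url) tuples."""
    # Imported here so card_links can be used without Selenium installed.
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver.get(url)
    # Explicit wait in place of time.sleep(4): blocks until at least one
    # session card is present in the DOM, or raises TimeoutException.
    cards = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((XPATH, "//div[@class='session-card']"))
    )
    return [card_links(card) for card in cards]


# Usage (requires Chrome and a matching driver):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   for title_url, poster_url in scrape(driver, url):
#       print("Url 1 :", title_url)
#       print("View Poster :", poster_url)
#   driver.quit()
```

Waiting for the cards explicitly avoids guessing a sleep duration, and returning tuples makes it easy to pass the result to `pandas.DataFrame` if a table is wanted.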