【Question title】: WebScrape - Getting the href
【Posted】: 2022-01-15 09:28:15
【Question】:

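At the end of each row on this page there is a "View Posters" link that contains a URL.

The first link in each row, which my code extracts as `ur`, is pulled fine.

I can't figure out how to pull the "View Posters" URL.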

from selenium import webdriver
import time
import pandas as pd
driver = webdriver.Chrome()
import requests
from bs4 import BeautifulSoup

val=[]

absinfo=[]
sesinfo=[]

url = 'https://meetings.asco.org/meetings/2022-gastrointestinal-cancers-symposium/286/program-guide/search?q=&filters=%7B%22sessionType%22:%5B%7B%22key%22:%22Poster%20Session%22%7D%5D%7D'
res=requests.get(url)
soup=BeautifulSoup(res.content,'html.parser')


driver.get(url)
time.sleep(4)


productlist =driver.find_elements_by_xpath(".//div[@class='session-card']")
#times = soup.select('.time')

for b in productlist:
    ur=b.find_element_by_css_selector('a').get_attribute('href')

【Question comments】:

    Tags: python selenium web-scraping beautifulsoup css-selectors


    【Solution 1】:

    If you want to use Selenium, try the following XPath expressions to identify the href links under each item in the product list.

    driver.get("https://meetings.asco.org/meetings/2022-gastrointestinal-cancers-symposium/286/program-guide/search?q=&filters=%7B%22sessionType%22:%5B%7B%22key%22:%22Poster%20Session%22%7D%5D%7D")

    productlist = driver.find_elements_by_xpath(".//div[@class='session-card']")

    # Note: the find_element(s)_by_* helpers were removed in Selenium 4.3;
    # on newer versions use find_element(By.XPATH, ...) instead.
    for item in productlist:
        # Link on the session title
        print("Url 1 :" + item.find_element_by_xpath(".//span[@data-cy='sessionTitle']//a").get_attribute('href'))
        # Link wrapping the "View Posters" label
        print("View Poster :" + item.find_element_by_xpath(".//a[.//span[text()='View Posters']]").get_attribute('href'))
    

    Output:

    Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14170
    View Poster :https://meetings.asco.org/session/14170
    Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14145
    View Poster :https://meetings.asco.org/session/14145
    Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14169?presentation=205955
    View Poster :https://meetings.asco.org/session/14169
    Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14168
    View Poster :https://meetings.asco.org/session/14168
    Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14450
    View Poster :https://meetings.asco.org/session/14450
    Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14163
    View Poster :https://meetings.asco.org/session/14163
    Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14449
    View Poster :https://meetings.asco.org/session/14449
    Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14451
    View Poster :https://meetings.asco.org/session/14451
    Url 1 :https://meetings.asco.org/2022-asco-gastrointestinal-cancers-symposium/14166
    View Poster :https://meetings.asco.org/session/14166
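
For readers on a newer Selenium release, the same lookup can be sketched with the generic `find_element(by, value)` calls and an explicit wait in place of a fixed sleep. This is only a sketch under assumptions: the helper names `card_links` and `scrape` are made up for illustration, the XPath expressions are copied from the answer above, and returning `None` when a card has no "View Posters" link is a design choice, not something the original answer does.

```python
XPATH = "xpath"  # the string value of selenium.webdriver.common.by.By.XPATH

# XPath expressions taken from the answer above.
CARD_XPATH = "//div[@class='session-card']"
TITLE_XPATH = ".//span[@data-cy='sessionTitle']//a"
POSTER_XPATH = ".//a[.//span[text()='View Posters']]"


def card_links(card):
    """Return (session_url, poster_url) for one session-card element."""
    session_url = card.find_element(XPATH, TITLE_XPATH).get_attribute("href")
    # find_elements returns an empty list instead of raising when nothing matches,
    # so a card without a "View Posters" link yields None rather than an exception.
    posters = card.find_elements(XPATH, POSTER_XPATH)
    poster_url = posters[0].get_attribute("href") if posters else None
    return session_url, poster_url


def scrape(url, timeout=10):
    """Open the page in Chrome and collect both links from every card."""
    # Imported here so card_links can be exercised without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until the cards are present instead of sleeping a fixed time.
        cards = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((XPATH, CARD_XPATH))
        )
        return [card_links(card) for card in cards]
    finally:
        driver.quit()
```

Calling `scrape(url)` with the program-guide URL from the question should give one `(session_url, poster_url)` pair per card, which can then be printed or loaded into a pandas DataFrame.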
    
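
In the sample output above, every "View Posters" URL reuses the numeric ID that ends the path of the corresponding session URL. Assuming that pattern holds (it is only observed in these nine rows and is not documented anywhere in this thread), the poster URL could also be derived from the first link without a second element lookup. The function name below is hypothetical.

```python
from urllib.parse import urlsplit


def poster_url_from_session_url(session_url):
    """Guess the "View Posters" URL from a session URL.

    Assumption: the poster page lives at /session/<id>, where <id> is the
    last path segment of the session URL, as seen in the sample output.
    """
    parts = urlsplit(session_url)
    # The query string (e.g. ?presentation=205955) is not part of parts.path.
    session_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    if not session_id.isdigit():
        raise ValueError(f"no numeric session id in {session_url!r}")
    return f"{parts.scheme}://{parts.netloc}/session/{session_id}"
```

Reading the real `href` with Selenium remains the safer option, since it does not depend on the site keeping this URL scheme.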

    【Comments】:
