【Question title】:How to retrieve the date when a video was released on YouTube using Python and Selenium
【Posted】:2022-01-03 15:16:25
【Question description】:

I wrote a script with Python and Selenium that runs a search on YouTube. Once the results page has fully loaded, I can only get the titles of the results. Is there a line of code I can integrate to get the release date as well?

Here is my code:

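from time import sleep

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def youTube():
    term = 'bitcoin'
    tit = []

    d = webdriver.Firefox()
    d.get('https://www.youtube.com/results?search_query='+term+'&sp=CAISAhAB')
    sleep(3)

    # Dismiss the cookie consent dialog (the button reads 'Accetto' in the Italian UI)
    d.find_element(By.XPATH, "//*[contains(text(), 'Accetto')]").click()
    sleep(2)

    scrollHeight = d.execute_script("return window.scrollMaxY")
    print(scrollHeight)
    scrolled_pages = 0
    # Scroll one page at a time until the vertical offset reaches 3000 px
    while d.execute_script("return window.pageYOffset") < 3000:
        d.execute_script("window.scrollByPages(1)")
        scrolled_pages += 1
        sleep(0.2)
    # Collect the title of every visible video result
    for my_elem in WebDriverWait(d, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//yt-formatted-string[@class='style-scope ytd-video-renderer' and @aria-label]"))):
        tit.append(my_elem.text)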

【Question comments】:

    Tags: selenium selenium-webdriver xpath youtube webdriverwait


    【Solution 1】:

    To retrieve the date/time when a video was published on YouTube using Python and Selenium, you can use the following Locator Strategy:

    • Code block:

      driver.get("https://www.youtube.com/results?search_query=%27+term+%27&sp=CAISAhAB")
      print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@id='metadata-line' and @class='style-scope ytd-video-meta-block']//following::span[2][contains(., 'hours') or contains(., 'day')]")))])
      
    • Console output:

      ['2 hours ago', '2 hours ago', '3 hours ago', 'Streamed 3 hours ago', 'Streamed 3 hours ago', '5 hours ago', 'Streamed 6 hours ago', '6 hours ago', '6 hours ago', '7 hours ago', '8 hours ago', 'Streamed 8 hours ago', '9 hours ago', 'Streamed 10 hours ago', '11 hours ago']
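
    The locator above returns relative labels such as '2 hours ago' rather than an actual date. If an approximate timestamp is needed, the label can be converted by subtracting the stated age from the current time. The following is a minimal sketch, not part of the original answer: the function name approximate_publish_time and the unit table are hypothetical, months and years are rough averages, and only English labels are assumed.

```python
import re
from datetime import datetime, timedelta

# Approximate length of each unit in seconds (assumption: months and years
# are rough averages, so results for old videos are only estimates).
_UNIT_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 7 * 86400,
    "month": 30 * 86400,
    "year": 365 * 86400,
}

# Matches e.g. "2 hours ago" or "Streamed 3 hours ago" (English labels only).
_AGE_RE = re.compile(
    r"(\d+)\s+(second|minute|hour|day|week|month|year)s?\s+ago",
    re.IGNORECASE,
)


def approximate_publish_time(label, now=None):
    """Estimate a publish time from a relative label.

    Returns None when the label contains no recognisable relative age.
    """
    match = _AGE_RE.search(label)
    if match is None:
        return None
    amount = int(match.group(1))
    unit = match.group(2).lower()
    if now is None:
        now = datetime.now()
    return now - timedelta(seconds=amount * _UNIT_SECONDS[unit])


if __name__ == "__main__":
    # Hypothetical usage with labels copied from the console output above.
    for text in ["2 hours ago", "Streamed 3 hours ago"]:
        print(text, "->", approximate_publish_time(text))
```

    Passing a fixed now value makes the conversion reproducible. The result is only as precise as the label itself, so '2 hours ago' can be off by up to an hour.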
      

    【Comments】:

    • It seems correct, but when I try to run it I get this error: dates = WebDriverWait(d, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@id='metadata-line' and @class='style-scope ytd-video-meta-block']//following::span[2][contains(., 'hours') or contains(., 'day')]"))) File "C:\Users\Surface\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message:
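
    One possible cause of this timeout, which the thread does not confirm: the question's script clicks a consent button labelled 'Accetto', which suggests the page is rendered in Italian, so no span would contain the English words 'hours' or 'day'. A more tolerant approach is to read the text of every metadata span and filter in Python instead of in the XPath. The sketch below shows only the filtering step. The function name keep_age_labels is hypothetical, and the Italian unit words are an assumption about what YouTube displays.

```python
import re

# Assumed unit words: English, plus Italian because the consent button in the
# question reads 'Accetto'. The Italian wording is a guess, not verified.
_AGE_PATTERN = re.compile(
    r"\b\d+\s+(?:"
    r"seconds?|minutes?|hours?|days?|weeks?|months?|years?"  # English
    r"|second[oi]|minut[oi]|or[ae]|giorn[oi]|settiman[ae]|mes[ei]|ann[oi]"  # Italian
    r")\b",
    re.IGNORECASE,
)


def keep_age_labels(texts):
    """Keep only strings that look like a relative age, e.g. '2 hours ago'."""
    return [t.strip() for t in texts if _AGE_PATTERN.search(t)]


if __name__ == "__main__":
    # Hypothetical span texts: view counts mixed with age labels.
    sample = ["12K views", "2 hours ago", "3 ore fa", "450 visualizzazioni"]
    print(keep_age_labels(sample))
```

    The input list would come from the span texts collected with Selenium, for instance every span under the div with id metadata-line (an assumption about the page structure). View counts are dropped because the pattern requires a number followed by a time unit.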