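【Posted】: 2021-08-25 17:39:34
【Question】:
In my project I am trying to scrape a YouTube video's view count, comment count, likes, and dislikes. I cannot get the comment count; I have tried several approaches, but nothing changed. Here is my code, please help: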
import selenium
from selenium import webdriver
import pandas as pd
import time
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
# Choose the browser; chromedriver must be on the PATH
driver = webdriver.Chrome()
# Containers for the scraped values
data = {'Likes' : [], 'Dislikes' : [], 'Comments' : [], 'Views' : []}
dataframe = pd.DataFrame(data)
# Open the video page
driver.get("https://www.youtube.com/watch?v=fHI8X4OXluQ")
# Wait for the page to load
time.sleep(5)
# Locate the elements by XPath (absolute paths, copied by hand)
Likes = driver.find_element(By.XPATH,
    '/html/body/ytd-app/div/ytd-page-manager/ytd-watch-flexy/div[5]/div[1]/div/div[8]/div[2]'
    '/ytd-video-primary-info-renderer/div/div/div[3]/div/ytd-menu-renderer/div[2]'
    '/ytd-toggle-button-renderer[1]/a/yt-formatted-string').text
Dislikes = driver.find_element(By.XPATH,
    '/html/body/ytd-app/div/ytd-page-manager/ytd-watch-flexy/div[5]/div[1]/div/div[8]/div[2]'
    '/ytd-video-primary-info-renderer/div/div/div[3]/div/ytd-menu-renderer/div[2]'
    '/ytd-toggle-button-renderer[2]/a/yt-formatted-string').text
View = driver.find_elements(By.XPATH, '//div[@id="count"]')
Comments = driver.find_elements(By.XPATH,
    '/html/body/ytd-app/div/ytd-page-manager/ytd-watch-flexy/div[5]/div[1]/div/ytd-comments'
    '/ytd-item-section-renderer/div[1]/ytd-comments-header-renderer/div[1]/h2'
    '/yt-formatted-string/span[1]')
print(Likes)
print(Dislikes)
print(View[1].text)
print(Comments)
driver.quit()
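The values read through `.text` are strings, while the `data` dict above is presumably meant to hold numbers. Below is a minimal sketch of a converter, assuming the count strings look like `1,234 views`, `56 Comments`, or `12K`. Those formats, the K/M/B suffixes, and the helper name `parse_count` are assumptions for illustration and are not taken from the page itself.

```python
import re

# Multipliers for abbreviated counts; the K/M/B suffixes are an assumption
# about how the page abbreviates large numbers.
_MULTIPLIERS = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

# A number with optional thousands separators and decimals, optionally
# followed by a standalone K/M/B suffix.
_COUNT_RE = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*([KMB]\b)?", re.IGNORECASE)


def parse_count(text):
    """Return the first count found in text as an int, or None if there is none."""
    match = _COUNT_RE.search(text)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    suffix = match.group(2)
    multiplier = _MULTIPLIERS[suffix.upper()] if suffix else 1
    return int(round(number * multiplier))


# Hypothetical scraped strings, used only to exercise the helper.
row = {
    "Likes": parse_count("12K"),
    "Comments": parse_count("56 Comments"),
    "Views": parse_count("1,234 views"),
}
print(row)
```

Returning `None` when no number is present makes a missing or not-yet-loaded element easy to spot before the row is stored.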
【Comments】:
- It is always good practice to write relative XPaths.
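To illustrate why relative XPaths tend to hold up better, here is a small sketch using only the standard library's `xml.etree.ElementTree`, which supports a subset of XPath. The markup is hypothetical and heavily simplified; it is not YouTube's real DOM, and the names `LAYOUT_A`, `LAYOUT_B`, and `first_text` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup (NOT YouTube's real DOM).
LAYOUT_A = """
<html><body><div id="page">
  <div id="info"><span class="view-count">1,234 views</span></div>
  <div id="comments"><h2><span>56</span><span> Comments</span></h2></div>
</div></body></html>
""".strip()

# The same content after an unrelated banner is inserted above it.
LAYOUT_B = LAYOUT_A.replace('<div id="info">', '<div id="banner"/><div id="info">')

# Absolute style: every step and position from the root must match exactly.
ABSOLUTE = "./body/div/div[2]/h2/span[1]"
# Relative style: anchor on a stable attribute, wherever the element sits.
RELATIVE = ".//div[@id='comments']/h2/span"


def first_text(markup, path):
    """Return the text of the first element matching path, or None if nothing matches."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.text


for name, markup in (("A", LAYOUT_A), ("B", LAYOUT_B)):
    print(name, "absolute:", first_text(markup, ABSOLUTE))
    print(name, "relative:", first_text(markup, RELATIVE))
```

The positional path stops matching as soon as one sibling is added, while the attribute-anchored path still finds the count. Applied to the question, a relative form built from the tag names in the asker's own path might be `//ytd-comments-header-renderer//h2/yt-formatted-string/span[1]`; that is an untested assumption about the live page, and the result still needs `.text` read from a single element rather than printing the list returned by a `find_elements` call.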
Tags: python selenium web-scraping youtube