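【Posted】:2022-08-17 18:16:15
【Question】:
I want to scrape the prices from this web page.
I started with BeautifulSoup, then switched to Selenium because the data is loaded dynamically, and now I'm stuck: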
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
url = "https://cnft.tools/claynation"
s = Service(r"C:\Users\marti\Downloads\chromedriver_win32\chromedriver.exe")
driver = webdriver.Chrome(service=s)
driver.get(url)
pieces = driver.find_element(By.CLASS_NAME, '[name="card cardtools zoom card-group"]')
# driver.find_elements(By.XPATH('//*[@id="__next"]/div/main/div/div[2]/div/div[2]/div/div[3]/div/div[1]/div[1]/button/div/div[2]/div[1]/div[1]/span/div'))
# pieces
# for piece in pieces:
#     rank = piece.find_element("xpath", './/*[@id="__next"]/div/main/div/div[2]/div/div[2]/div/div[3]/div/div[1]/div[1]/button/div/div[2]/div[1]/div[1]/span/div').text
#     price = piece.find_element("xpath", './/*[@id="__next"]/div/main/div/div[2]/div/div[2]/div/div[3]/div/div[1]/div[1]/button/div/div[2]/div[3]/div/div').text
#     print(rank, price)
and it returns this:
InvalidSelectorException: Message: invalid selector: An invalid or illegal selector was specified
(Session info: chrome=104.0.5112.102)
Stacktrace:
Backtrace:
Ordinal0 [0x00B378B3+2193587]
Ordinal0 [0x00AD0681+1771137]
Ordinal0 [0x009E41A8+803240]
Ordinal0 [0x009E6BB4+814004]
Ordinal0 [0x009E6A72+813682]
Ordinal0 [0x009E6D00+814336]
Ordinal0 [0x00A121B5+991669]
Ordinal0 [0x00A1273B+993083]
Ordinal0 [0x00A3F7C2+1177538]
Ordinal0 [0x00A2D7F4+1103860]
Ordinal0 [0x00A3DAE2+1170146]
Ordinal0 [0x00A2D5C6+1103302]
Ordinal0 [0x00A077E0+948192]
Ordinal0 [0x00A086E6+952038]
GetHandleVerifier [0x00DE0CB2+2738370]
GetHandleVerifier [0x00DD21B8+2678216]
GetHandleVerifier [0x00BC17AA+512954]
GetHandleVerifier [0x00BC0856+509030]
Ordinal0 [0x00AD743B+1799227]
Ordinal0 [0x00ADBB68+1817448]
Ordinal0 [0x00ADBC55+1817685]
Ordinal0 [0x00AE5230+1856048]
BaseThreadInitThunk [0x765B6739+25]
RtlGetFullPathName_UEx [0x777590AF+1215]
RtlGetFullPathName_UEx [0x7775907D+1165]
Could this be because of authorization?
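For what it's worth, an InvalidSelectorException is raised when the selector string itself cannot be parsed, so the locator is the more likely culprit than authorization. By.CLASS_NAME expects one bare class name, while the code above passes an attribute-style string containing several space-separated classes. Below is a minimal sketch of one way around it, assuming the cards really do carry the classes card, cardtools, zoom and card-group (taken from the code above, not verified against the page). The helper names classes_to_css and find_cards, and the 15-second timeout, are hypothetical and chosen only for illustration.

```python
def classes_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute into a compound CSS selector."""
    # "card zoom" becomes ".card.zoom", which matches elements having both classes.
    return "".join("." + name for name in class_attr.split())


def find_cards(driver, class_attr="card cardtools zoom card-group", timeout=15):
    """Wait for the dynamically rendered cards, then return them as a list.

    The default class list is an assumption copied from the question's code.
    """
    # Imported lazily so classes_to_css can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.CSS_SELECTOR, classes_to_css(class_attr))
    # Raises TimeoutException if no matching element appears within `timeout` seconds.
    return WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located(locator)
    )
```

With the driver from the code above, find_cards(driver) would replace the failing find_element call. Relative XPath lookups inside each card (starting with ".//") could then pull out the rank and price, although the long absolute paths in the commented-out lines would need to be shortened so that they are relative to the card element.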
Tags: python selenium web-scraping beautifulsoup