【Question title】: "An invalid or illegal selector was specified" when scraping dynamic content
【Posted】: 2022-08-17 18:16:15
【Question】:

I want to scrape prices from this web page.

I started with BeautifulSoup, then switched to Selenium because the data is loaded dynamically, and now I am stuck:

from selenium import webdriver 
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
url = "https://cnft.tools/claynation"

s = Service(r"C:\Users\marti\Downloads\chromedriver_win32\chromedriver.exe")
driver = webdriver.Chrome(service=s)

driver.get(url)

pieces = driver.find_element(By.CLASS_NAME, '[name="card cardtools zoom card-group"]')
# driver.find_elements(By.XPATH('//*[@id="__next"]/div/main/div/div[2]/div/div[2]/div/div[3]/div/div[1]/div[1]/button/div/div[2]/div[1]/div[1]/span/div'))

# pieces

# for piece in pieces:
#     rank = piece.find_element("xpath", './/*[@id="__next"]/div/main/div/div[2]/div/div[2]/div/div[3]/div/div[1]/div[1]/button/div/div[2]/div[1]/div[1]/span/div').text
#     price = piece.find_element("xpath",'.//*[@id="__next"]/div/main/div/div[2]/div/div[2]/div/div[3]/div/div[1]/div[1]/button/div/div[2]/div[3]/div/div').text
#     print(rank, price)

and it returns this:

InvalidSelectorException: Message: invalid selector: An invalid or illegal selector was specified
  (Session info: chrome=104.0.5112.102)
Stacktrace:
Backtrace:
    Ordinal0 [0x00B378B3+2193587]
    Ordinal0 [0x00AD0681+1771137]
    Ordinal0 [0x009E41A8+803240]
    Ordinal0 [0x009E6BB4+814004]
    Ordinal0 [0x009E6A72+813682]
    Ordinal0 [0x009E6D00+814336]
    Ordinal0 [0x00A121B5+991669]
    Ordinal0 [0x00A1273B+993083]
    Ordinal0 [0x00A3F7C2+1177538]
    Ordinal0 [0x00A2D7F4+1103860]
    Ordinal0 [0x00A3DAE2+1170146]
    Ordinal0 [0x00A2D5C6+1103302]
    Ordinal0 [0x00A077E0+948192]
    Ordinal0 [0x00A086E6+952038]
    GetHandleVerifier [0x00DE0CB2+2738370]
    GetHandleVerifier [0x00DD21B8+2678216]
    GetHandleVerifier [0x00BC17AA+512954]
    GetHandleVerifier [0x00BC0856+509030]
    Ordinal0 [0x00AD743B+1799227]
    Ordinal0 [0x00ADBB68+1817448]
    Ordinal0 [0x00ADBC55+1817685]
    Ordinal0 [0x00AE5230+1856048]
    BaseThreadInitThunk [0x765B6739+25]
    RtlGetFullPathName_UEx [0x777590AF+1215]
    RtlGetFullPathName_UEx [0x7775907D+1165]

Could it be because of authorization?

    Tags: python selenium web-scraping beautifulsoup


    【Solution 1】:

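    No, this is not caused by authorization. The error message is straightforward: your selector is invalid because you are passing a CSS selector to a search by class name. By.CLASS_NAME expects one bare class name (for example card), not a bracketed attribute selector.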
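
To see why the browser rejects the locator, here is a minimal sketch of what is believed to happen under the hood. It assumes that Selenium's Python bindings rewrite a By.CLASS_NAME lookup into a CSS selector by prefixing the value with a dot; the helper name class_name_to_css is hypothetical and exists only for this illustration.

```python
def class_name_to_css(value: str) -> str:
    # Assumption: mirrors how the Python bindings are understood to turn a
    # By.CLASS_NAME value into a CSS selector (a dot followed by the value).
    return "." + value


# The value from the question: a dot followed by "[" is not valid CSS.
bad = class_name_to_css('[name="card cardtools zoom card-group"]')

# A single bare class name produces a valid class selector.
good = class_name_to_css("card")

print(bad)
print(good)
```

Under that assumption the question's locator becomes a dot followed directly by a bracket, which no CSS parser accepts, hence the InvalidSelectorException.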

    You need to replace

    driver.find_element(By.CLASS_NAME, '[name="card cardtools zoom card-group"]')

    with

    driver.find_element(By.CSS_SELECTOR, '[class="card cardtools zoom card-group"]')
    

    or

    driver.find_element(By.CSS_SELECTOR, '.card.cardtools.zoom.card-group')
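
Because the page content is rendered dynamically, the cards may not exist yet when the lookup runs. The following is a minimal sketch, not a definitive implementation, that builds the compound class selector from the class attribute and waits for matching elements. The helper names class_attr_to_css and wait_for_cards and the 10-second timeout are assumptions made for illustration; the sketch also assumes the class names are plain identifiers that need no CSS escaping.

```python
def class_attr_to_css(class_attr: str) -> str:
    """Turn 'card zoom' into the compound CSS selector '.card.zoom'."""
    names = class_attr.split()
    if not names:
        raise ValueError("expected at least one class name")
    return "".join("." + name for name in names)


def wait_for_cards(driver, class_attr="card cardtools zoom card-group", timeout=10):
    """Wait until at least one matching element is present, then return all matches."""
    # Imported lazily so class_attr_to_css can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.CSS_SELECTOR, class_attr_to_css(class_attr))
    # Raises TimeoutException if nothing matches within `timeout` seconds.
    return WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located(locator)
    )


selector = class_attr_to_css("card cardtools zoom card-group")
print(selector)
```

With the driver from the question, calling wait_for_cards(driver) after driver.get(url) would return a list of elements to loop over. Whether those class names still match the live page is not something this sketch can confirm.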
    

