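【Posted】:2022-01-19 15:21:52
【Question】:
I'm web scraping with Python, but the page has to be scrolled before all of its content loads, so I'm using Selenium. I have the first part working: the web driver starts, clicks "accept" on the cookie banner and scrolls x times (2 times in the code below, because otherwise I have to wait 5 minutes just to get an empty list T_T).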
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys

url = "https://www.aboutyou.es/c/hombre/zapatos-20215"
opt = webdriver.ChromeOptions()
opt.add_argument("start-maximized")
driver = webdriver.Chrome(options=opt)
driver.get(url)

# Accept the cookie banner.
cookies = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, '//button[@id="onetrust-accept-btn-handler"]'))
).click()

#element = driver.find_element(By.XPATH, '//button[@data.testid="loadMoreButton_100"]')
html = driver.find_element(By.TAG_NAME, 'html')
intentos = 2  # number of scroll attempts

# Click the "load more" button if it shows up; otherwise scroll down.
try:
    mas = WebDriverWait(driver, 15).until(
        EC.presence_of_element_located((By.XPATH, '//button[@data-testid="loadMoreButton_100"]'))
    ).click()
except:
    html.send_keys(Keys.PAGE_DOWN)
    html.send_keys(Keys.PAGE_DOWN)
    html.send_keys(Keys.PAGE_DOWN)

# Repeat: click "load more" when present, else keep scrolling.
for i in range(intentos):
    try:
        mas = WebDriverWait(driver, 1).until(
            EC.presence_of_element_located((By.XPATH, '//button[@data-testid="loadMoreButton_100"]'))
        ).click()
    except:
        html.send_keys(Keys.PAGE_DOWN)
        html.send_keys(Keys.PAGE_DOWN)
        html.send_keys(Keys.PAGE_DOWN)
        html.send_keys(Keys.PAGE_DOWN)
    if i < intentos - 1:
        continue

# Select the product tiles (this is the call that returns an empty list).
grid_grande = driver.find_elements(By.XPATH,'//a[class="sc-16ol3xi-0 sc-163x4qs-0 fybchu loqbdm sc-nlxe42-2 fwTCrr"]')
print(grid_grande)
The element I want to select is the grid tile that contains all the other data, but I only get an empty list []:
<a data-testid="productTile-4218512" style="--product-tile-contents-height:112px" class="sc-16ol3xi-0 sc-163x4qs-0 fybchu loqbdm sc-nlxe42-2 fwTCrr" href="/p/panama-jack/botas-con-cordones-4218512"><div data-testid="productImage" class="sc-mt3y39-0 iYaafh">
<img height="100%" width="100%" decoding="async" importance="auto" loading="lazy" sizes="(max-width: 767px) calc(100vw / 3), calc(100vw / 4)" srcset="https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&quality=75&trim=1&height=160&width=120 120w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&quality=75&trim=1&height=480&width=360 360w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&quality=75&trim=1&height=534&width=400 400w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&quality=75&trim=1&height=800&width=600 600w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&quality=75&trim=1&height=1067&width=800 800w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&quality=75&trim=1&height=1280&width=960 960w" style="border-radius:2px" alt="PANAMA JACK - Botas con cordones en marrón: frente" data-testid="productImageView" class="sc-1876d5f-0-Component giShmP">
<div class="sc-1i699m5-0 eHwLkT"><div data-testid="badge-GENERIC" class="sc-1dqvaay-1 cHhMjt">Más sostenible</div></div>
</div><button type="button" data-testid="wishListButton" class="sc-1yegbck-0 cFfJJS sc-122ag38-0 eHyXBK sc-1ytk4ze-1 jrOzwu sc-1cy39j4-0 eCBNan"><svg class="sc-vu2m91-0 cXGjqJ sc-1ytk4ze-0 ebHRsM" data-testid="WishListIcon"><use xlink:href="#/assets/media/ic-heart.e31e11e8.svg"></use></svg><div class="sc-122ag38-1 ixqHjB"></div></button><div class="sc-nlxe42-0 kRHZwU"><div class="sc-1qsfqrd-0 xHpAu"><p data-testid="brandName" class="sc-1vt6vwe-0 sc-1vt6vwe-2 sc-1qsfqrd-1 dmJKga cyVcre gtGpeQ">PANAMA JACK</p><div class="sc-18q4lz4-2 cySBlJ sc-1qsfqrd-6 khWqDb" data-testid="priceBox"><span data-testid="finalPrice" class="sc-2qclq4-0 sc-18q4lz4-0 ePNAqF fbtbBY">169,00 €</span></div><div class="sc-1qsfqrd-7 eUQMHN"><ul data-testid="ColorContainer" class="sc-1qsfqrd-3 eSoPTy">
<li data-testid="ColorBubble-simple-#663300" class="sc-kt3zrg-0 sc-kt3zrg-1 jEkiIS dhRoGM sc-1qsfqrd-8 dYSOSZ"></li><li data-testid="ColorBubble-simple-#000000" class="sc-kt3zrg-0 sc-kt3zrg-1 jEkiIS gmeSfI sc-1qsfqrd-8 dYSOSZ"></li><li data-testid="ColorBubble-simple-#663300" class="sc-kt3zrg-0 sc-kt3zrg-1 jEkiIS dhRoGM sc-1qsfqrd-8 dYSOSZ"></li><li data-testid="ColorBubble-simple-#4c2002" class="sc-kt3zrg-0 sc-kt3zrg-1 jEkiIS hFwoRv sc-1qsfqrd-8 dYSOSZ"></li><li class="sc-1qsfqrd-4 glNrlz">+<!-- -->2</li></ul><span data-testid="Sizes" class="sc-1qsfqrd-5 gZDHxk">Disponible en muchas tallas</span></div></div></div></a>
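Judging only from the code and markup quoted here, a likely cause of the empty list is the XPath in the final `find_elements` call: it writes `class` without the `@` prefix. In XPath, `a[class="..."]` looks for a child element named `class`, whereas `a[@class="..."]` tests the attribute. The sketch below illustrates the difference offline with the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath. `SAMPLE` is a trimmed stand-in for the tile above, not the live page, and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the product tile quoted in the question (assumption:
# the live page has the same attributes on the <a> element).
SAMPLE = """
<div>
  <a data-testid="productTile-4218512"
     class="sc-16ol3xi-0 sc-163x4qs-0 fybchu loqbdm sc-nlxe42-2 fwTCrr"
     href="/p/panama-jack/botas-con-cordones-4218512">
    <p data-testid="brandName">PANAMA JACK</p>
  </a>
</div>
"""

CLASSES = "sc-16ol3xi-0 sc-163x4qs-0 fybchu loqbdm sc-nlxe42-2 fwTCrr"
root = ET.fromstring(SAMPLE)

# Without "@": matches <a> elements that have a CHILD element <class>
# whose text equals CLASSES. No such child exists, so nothing matches.
without_at = root.findall(f'.//a[class="{CLASSES}"]')

# With "@": matches <a> elements whose class ATTRIBUTE equals CLASSES.
with_at = root.findall(f'.//a[@class="{CLASSES}"]')

print(len(without_at), len(with_at))  # 0 1
```

If this is indeed the cause, changing the locator to `//a[@class="sc-16ol3xi-0 sc-163x4qs-0 fybchu loqbdm sc-nlxe42-2 fwTCrr"]` should return the tiles in Selenium too, provided they have been rendered by the time the call runs. This has not been checked against the live site.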
【Comments】:
- It's hard to answer without seeing the actual site.
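That said, the markup pasted in the question is enough to try selectors offline. The generated class names (`sc-16ol3xi-0`, `fybchu` and so on) look like build-time hashes and may change between deployments; that is an assumption, nothing in the question confirms it. The `data-testid` attributes are probably a steadier hook. A minimal sketch using only the standard library's `html.parser`, where `TILE_HTML` is a trimmed copy of the tile from the question and `TileParser` is a hypothetical helper name:

```python
from html.parser import HTMLParser

# Trimmed copy of the tile markup from the question.
TILE_HTML = """
<a data-testid="productTile-4218512" class="sc-16ol3xi-0 fwTCrr"
   href="/p/panama-jack/botas-con-cordones-4218512">
  <p data-testid="brandName" class="sc-1vt6vwe-0">PANAMA JACK</p>
  <span data-testid="finalPrice" class="sc-2qclq4-0">169,00 €</span>
</a>
"""


class TileParser(HTMLParser):
    """Builds one dict per <a data-testid="productTile-..."> element."""

    # data-testid value -> key used in the output record
    FIELDS = {"brandName": "brand", "finalPrice": "price"}

    def __init__(self):
        super().__init__()
        self.tiles = []        # one record per product tile
        self._in_tile = False  # True while inside a tile's <a>...</a>
        self._field = None     # record key waiting for its text, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        testid = attrs.get("data-testid") or ""
        if tag == "a" and testid.startswith("productTile-"):
            self.tiles.append({"href": attrs.get("href")})
            self._in_tile = True
        elif self._in_tile:
            # None for elements we are not interested in.
            self._field = self.FIELDS.get(testid)

    def handle_data(self, data):
        text = data.strip()
        if self._in_tile and self._field and text:
            self.tiles[-1][self._field] = text
            self._field = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_tile = False


parser = TileParser()
parser.feed(TILE_HTML)
print(parser.tiles)
```

In Selenium the corresponding locator would be the CSS selector `a[data-testid^="productTile-"]` passed with `By.CSS_SELECTOR`; whether it matches on the live page is untested here.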
Tags: python selenium selenium-chromedriver