【Posted】: 2022-01-21 11:24:21
【Question】:
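I am trying to scrape the search result elements on this page with Selenium: https://shop.bodybuilding.com/search?q=protein+bar&selected_tab=Products, but it only gives me the first 4 elements. I am not sure why. Is it because the page is rendered with JavaScript? And how can I scrape all of the elements on this search page? This is the code I wrote: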
import requests
import numpy as np
import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome(executable_path='C:/chromedriver')
url = 'https://shop.bodybuilding.com/search?q=protein+bar&selected_tab=Products'
driver.get(url)

# Parse whatever HTML the browser has rendered at this moment
soup = BeautifulSoup(driver.page_source, 'html.parser')
all_items = soup.find_all('div', {'class': 'ProductTile ProductTile--flat Animate AnimateOnHover Animate--fade-in Animate--animated'})

for item in all_items:
    price = item.find('div', {'class': 'Price ProductTile__price'}).text
    name = item.find('p', {'class': 'ProductTile__title'}).text
    image = item.find('img')['src']
    # Named `link` so it does not overwrite the search `url` above
    link = item.find('a', {'class': 'Anchor ProductTile__image'})['href']
    print(image)
This is the result for the names on this page. As you can see, it only scrapes the first 4 elements!
BSN Protein Crisp Bars
Optimum Nutrition Protein Wafers
Herbaland Vegan Protein Gummies
Battle Bars Full Battle Rattle (FBR) Protein Bar
The same thing happens with the prices, images, and URLs.
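Two things could plausibly explain a result capped at 4 tiles, although the post confirms neither. First, the page may render or animate further tiles only as they scroll into view. Second, passing a multi-class string to `find_all` makes BeautifulSoup match the exact value of the `class` attribute, so any tile that has not yet received `Animate--animated` would be skipped. The sketch below addresses both under those assumptions: it scrolls until the page height stops growing, then selects tiles by the single class `ProductTile`. The names `scroll_until_stable` and `scrape_products` are hypothetical helpers, and the selectors are taken from the class names in the code above.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` only needs an `execute_script` method, as a Selenium WebDriver has.
    Returns the final page height.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazy-loaded content time to appear
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded by the last scroll
        last_height = new_height
    return last_height


def scrape_products(url, timeout=15):
    """Hypothetical helper: load `url`, scroll, and collect product data."""
    # Imported here so scroll_until_stable stays usable without these packages.
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: chromedriver is discoverable (on PATH or via Selenium Manager).
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait for at least one tile instead of parsing immediately.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "div.ProductTile"))
        )
        scroll_until_stable(driver)
        soup = BeautifulSoup(driver.page_source, "html.parser")
    finally:
        driver.quit()

    products = []
    # A CSS selector on one class matches tiles regardless of their other classes.
    for tile in soup.select("div.ProductTile"):
        title = tile.select_one(".ProductTile__title")
        price = tile.select_one(".ProductTile__price")
        image = tile.select_one("img")
        link = tile.select_one("a.ProductTile__image")
        products.append({
            "name": title.get_text(strip=True) if title else None,
            "price": price.get_text(strip=True) if price else None,
            "image": image.get("src") if image else None,
            "url": link.get("href") if link else None,
        })
    return products
```

Calling `scrape_products('https://shop.bodybuilding.com/search?q=protein+bar&selected_tab=Products')` would return a list of dictionaries, one per tile found. If the site loads more results through a button or pagination rather than scrolling, this sketch would still miss them and a click or page loop would be needed instead.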
【Comments】:
Tags: python selenium web-scraping beautifulsoup