【Posted】:2020-10-07 19:36:35
【Question】:
I'm building a web scraper and, as you know, everything is protected these days, so I'm trying the Selenium driver, but this isn't working:
from selenium import webdriver
import pandas as pd
import bs4

# Lists that will become the CSV columns
products = []
prices = []
orginalPrice = []
sizes = []
open('product.csv', 'w')

# Launch Chrome and load the listing page
driver = webdriver.Chrome("/home/arcot/Documents/chromedriver")
driver.get("https://www.myntra.com/bra")
content = driver.page_source
soup = bs4.BeautifulSoup(content, features="lxml")

# Loop over the result of find() and pull out each field
for a in soup.find('li', attrs={'class': 'product-base'}):
    productName = a.find('h3', attrs={'class': 'product-product'})
    productBrand = a.find('h4', attrs={'class': 'product-brand'})
    size = a.find('button', attrs={'class': 'product-sizeButton'})
    productPrice = a.find('span', attrs={'class': 'product-discountedPrice'})
    OrginalPrice = a.find('span', attrs={'class': 'product-strike'})
    name = (str(productBrand) + " " + str(productName))
    products.append(name)
    prices.append(str(productPrice))
    orginalPrice.append(str(OrginalPrice))

# Build the DataFrame and write it to CSV
data = {'ProductName': products, 'Price': prices, 'orginalPrice': orginalPrice, 'Sizes': sizes}
df = pd.DataFrame.from_dict(data, orient='index')
df.to_csv('product.csv', index=True, encoding='utf-8')
I tried debugging, but I can't work out why: the price comes through, but the other product fields I tried do not. Can anyone help me?
【Discussion】:
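One likely cause, offered as a sketch rather than a confirmed diagnosis: `soup.find(...)` returns only the first matching `<li>`, so the `for` loop walks that single element's children instead of the list of products, and `str(tag)` stores the whole tag (or the string `None`) rather than its text. The example below uses `find_all` and `get_text` on a small hypothetical HTML snippet that reuses the class names from the question. `SAMPLE_HTML`, `text_or_none`, and `parse_products` are illustrative names, and the real Myntra markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that reuses the class names from the question;
# the real page may be structured differently.
SAMPLE_HTML = """
<ul>
  <li class="product-base">
    <h4 class="product-brand">BrandA</h4>
    <h3 class="product-product">Item One</h3>
    <span class="product-discountedPrice">Rs. 499</span>
    <span class="product-strike">Rs. 999</span>
  </li>
  <li class="product-base">
    <h4 class="product-brand">BrandB</h4>
    <h3 class="product-product">Item Two</h3>
    <span class="product-discountedPrice">Rs. 650</span>
  </li>
</ul>
"""


def text_or_none(parent, tag, css_class):
    """Return the stripped text of the first matching descendant, or None."""
    node = parent.find(tag, class_=css_class)
    return node.get_text(strip=True) if node is not None else None


def parse_products(html):
    """Return one dict per <li class="product-base"> found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # find_all() returns every matching <li>; find() returns only the first.
    for item in soup.find_all("li", class_="product-base"):
        rows.append({
            "Brand": text_or_none(item, "h4", "product-brand"),
            "ProductName": text_or_none(item, "h3", "product-product"),
            "Price": text_or_none(item, "span", "product-discountedPrice"),
            "OriginalPrice": text_or_none(item, "span", "product-strike"),
        })
    return rows


if __name__ == "__main__":
    for row in parse_products(SAMPLE_HTML):
        print(row)
```

With Selenium, the same function could be called as `parse_products(driver.page_source)`. If the product list is rendered by JavaScript, waiting for `li.product-base` with `WebDriverWait` before reading `page_source` may also be needed. The returned list of dicts can be passed directly to `pd.DataFrame(rows)` and saved with `to_csv('product.csv', index=False)`, which gives one row per product, whereas `from_dict(data, orient='index')` puts each field on its own row.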
Tags: python-3.x pandas selenium web-scraping