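【Posted】:2021-05-13 22:29:42
【Question】:
Help
I have been trying to skip the error raised when there is no "text" to extract, but nothing I have tried has worked. I am a bit worried, since I am new to coding. The code below does work when there is always "text" to extract.
Can someone tell me where to put the condition? :/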
```
import requests
import pandas as pd
from bs4 import BeautifulSoup

baseUrl = "https://www.sodimac.cl"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
}

# Collect the product links from listing pages 1 to 6.
productlinks = []
for x in range(1, 7):
    r = requests.get(f"https://www.sodimac.cl/sodimac-cl/category/scat922339/Ceramicas?currentpage={x}", headers=headers)
    soup = BeautifulSoup(r.content, 'lxml')
    productlist = soup.find_all('div', class_='jsx-3663142191 search-results-products-container')
    for item in productlist:
        for link in item.find_all('a', id="title-pdp-link", href=True):
            productlinks.append(baseUrl + link['href'])

# Visit each product page and extract its fields.
Ceramicaslist = []
for link in productlinks:
    r = requests.get(link, headers=headers)
    soup = BeautifulSoup(r.content, 'lxml')
    # find() returns None when nothing matches, so .text raises AttributeError here.
    name = soup.find('div', class_='jsx-4129468047 product-brand').text.strip()
    descripcion = soup.find('h1', class_='jsx-4129468047 product-title').text.strip()
    modelo = soup.find('div', class_='jsx-4129468047 product-model').text.strip()
    SKU = soup.find('div', class_='jsx-4129468047 product-cod').text.strip()
    precio = soup.find('span', class_='jsx-3655512908').text.strip()
    Ceramicas = {
        'name': name,
        'descripcion': descripcion,
        'modelo': modelo,
        'SKU': SKU,
        'precio': precio
    }
    Ceramicaslist.append(Ceramicas)
    print('Saving: ', Ceramicas['name'], Ceramicas['descripcion'], Ceramicas['modelo'], Ceramicas['SKU'], Ceramicas['precio'])

# Build the table and save it.
df = pd.DataFrame(Ceramicaslist)
print(df)
df.to_csv('CeramicasSodimac.csv')
```
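
One possible place for the condition, offered as a sketch rather than a definitive fix: `soup.find(...)` returns `None` when no element matches, so calling `.text` on that result raises `AttributeError`. The check can live in a small helper that tests for `None` before reading `.text`. The helper name `safe_text`, its `default` parameter, and the sample HTML are assumptions made up for illustration; they come neither from the original code nor from BeautifulSoup.

```python
from bs4 import BeautifulSoup


def safe_text(soup, tag, class_, default=None):
    """Return the stripped text of the first matching element, or `default`.

    `safe_text` is a hypothetical helper, not part of BeautifulSoup.
    """
    element = soup.find(tag, class_=class_)
    if element is None:  # find() returns None when nothing matches
        return default
    return element.text.strip()


# Made-up HTML standing in for a product page that has no model field.
sample_html = """
<div class="jsx-4129468047 product-brand"> Acme </div>
<h1 class="jsx-4129468047 product-title">Ceramic tile 45x45</h1>
"""
soup = BeautifulSoup(sample_html, "html.parser")

name = safe_text(soup, "div", "jsx-4129468047 product-brand")
modelo = safe_text(soup, "div", "jsx-4129468047 product-model")
print(name, modelo)
```

In the per-product loop, each `soup.find(...).text.strip()` line could then become a `safe_text(soup, ...)` call, so a product that lacks a field is stored with `None` in that column instead of stopping the run.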
【Comments】:
Tags: python-3.x pandas web-scraping