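【Posted】:2021-10-22 10:20:59
【Question】:
I'm working through some simple web-scraping tutorials, but I'm finding it hard to make progress.
In particular, 'title' is the only element I can extract text from. For the remaining 'price' and 'status', I always get the same error: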
AttributeError: 'NoneType' object has no attribute 'text'
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://www.ebay.it/sch/i.html?_from=R40&_trksid=p2380057.m570.l1313&_nkw=monitor&_sacat=0'

def get_data(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')
    return soup

def parse(soup):
    productlist = []
    results = soup.find_all('div', {'class' : 's-item__info clearfix'})
    for item in results:
        product = {
            'title': item.find('h3', {'class': 's-item__title'}).text,
            'price': float(item.find('span', {'class': 's-item__price'}).text.replace('EUR','').strip()),
            'status': item.find('span',{'class':'SECONDARY_INFO'}).text,
        }
        productlist.append(product)
    return productlist

def output(productlist):
    productsdf = pd.DataFrame(productlist)
    productsdf.to_csv('output.csv', index = False)
    print('Saved to CSV')
    return productsdf

soup = get_data(url)
productlist = parse(soup)
ug = output(productlist)
Thanks to anyone willing to help.
【Comments】:
Tags: python-3.x web-scraping beautifulsoup
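One likely cause of the AttributeError, though it can't be confirmed without seeing the HTML that requests actually received: `find()` returns `None` when no matching tag exists, and calling `.text` on `None` fails. Some result blocks may simply have no price or SECONDARY_INFO span. The sketch below shows one way to guard against that. The `SAMPLE_HTML` markup, the helper names `text_or_none` and `parse_price`, and the 'EUR 129,99' price format (comma as decimal separator) are all assumptions made for illustration, not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a results page: the second item
# deliberately lacks the price and SECONDARY_INFO spans.
SAMPLE_HTML = """
<div class="s-item__info clearfix">
  <h3 class="s-item__title">Monitor A</h3>
  <span class="s-item__price">EUR 129,99</span>
  <span class="SECONDARY_INFO">Nuovo</span>
</div>
<div class="s-item__info clearfix">
  <h3 class="s-item__title">Monitor B</h3>
</div>
"""


def text_or_none(item, tag, class_name):
    """Return the stripped text of the first matching tag, or None if absent."""
    node = item.find(tag, class_=class_name)
    return node.get_text(strip=True) if node is not None else None


def parse_price(raw):
    """Convert a string such as 'EUR 129,99' to a float (assumed format).

    Returns None when the input is missing or cannot be parsed.
    """
    if raw is None:
        return None
    # Assumption: '.' is a thousands separator and ',' the decimal separator.
    cleaned = raw.replace('EUR', '').strip().replace('.', '').replace(',', '.')
    try:
        return float(cleaned)
    except ValueError:  # e.g. a price range rather than a single number
        return None


def parse(soup):
    products = []
    for item in soup.find_all('div', class_='s-item__info'):
        products.append({
            'title': text_or_none(item, 'h3', 's-item__title'),
            'price': parse_price(text_or_none(item, 'span', 's-item__price')),
            'status': text_or_none(item, 'span', 'SECONDARY_INFO'),
        })
    return products


products = parse(BeautifulSoup(SAMPLE_HTML, 'html.parser'))
print(products)
```

With this approach, missing fields end up as `None` (empty cells once written to CSV through pandas) instead of stopping the whole run. If every item comes back with `None` for price and status, it may be worth printing one `item` to check whether the class names in the downloaded HTML match what the browser shows.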