How to access sibling elements in a site while scraping using Python and Beautiful Soup

[Question title]: How to access sibling elements in a site while scraping using python and Beautiful soup
[Posted]: 2019-02-22 08:53:06
[Question]:

I am trying to scrape the "listing-key-specs" of this website:

https://www.autotrader.co.uk/car-search?radius=30&postcode=ss156ee&onesearchad=Used&make=Renault&model=zoe&page=1

But I am only interested in the mileage spec, not the bhp or any of the other specs.

If I type

specs=article.find('ul',class_="listing-key-specs")
print(specs.text)

I get 6 pieces of information, for example:

2015 (65 reg)
Hatchback
13,033 miles
88bhp
Automatic
Electric

If I type

print(specs.li.text)

I only get the first spec, which is

2015 (65 reg)

How can I select one specific spec? Say, the "miles" spec?

[Comments]:

Tags: python web-scraping beautifulsoup

[Solution 1]:

You can extract the first child li of each spec list with a CSS selector:

from bs4 import BeautifulSoup as bs
import requests

# Fetch the search results page and parse it with the lxml parser
res = requests.get('https://www.autotrader.co.uk/car-search?radius=30&postcode=ss156ee&onesearchad=Used&make=Renault&model=zoe&page=1')
soup = bs(res.content, 'lxml')
# One entry per listing: the text of the first li inside each .listing-key-specs
details = [item.text for item in soup.select('.listing-key-specs li:first-child')]
print(details)

Less efficient alternatives are

.listing-key-specs li:nth-of-type(1)

or

.listing-key-specs :nth-child(1)

or

.listing-key-specs li:first-of-type

I am using BeautifulSoup 4.7.1, the latest version at the time of writing.
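The selectors above all target the first li, while the question asks for the mileage, which is the third item in the sample output. A minimal sketch of the same idea with `:nth-of-type(3)` follows. The inline HTML is a hypothetical stand-in built from the six values shown in the question, and the built-in `html.parser` is used only to keep the example dependency-light; the live page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for one listing, based on the output in the question.
html = """
<ul class="listing-key-specs">
  <li>2015 (65 reg)</li>
  <li>Hatchback</li>
  <li>13,033 miles</li>
  <li>88bhp</li>
  <li>Automatic</li>
  <li>Electric</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# li:nth-of-type(3) matches the third <li> inside each .listing-key-specs list.
mileages = [
    li.get_text(strip=True)
    for li in soup.select(".listing-key-specs li:nth-of-type(3)")
]
print(mileages)
```

This only works if the mileage is always the third item; a listing with a missing spec would shift the positions.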

[Comments]:

[Solution 2]:

Or simply:

print(specs('li')[2].text)

Output:

15,285 miles
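Calling a tag, as in `specs('li')`, is shorthand for `specs.find_all('li')`, so the result can be indexed like a list. The sketch below compares three ways to reach the mileage: by index, by walking siblings from the first li, and by matching the text itself. The HTML is a hypothetical sample based on the values in the question, and the assumption that the mileage text ends in "miles" comes only from that sample.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page may be structured differently.
html = """
<ul class="listing-key-specs">
  <li>2015 (65 reg)</li>
  <li>Hatchback</li>
  <li>13,033 miles</li>
  <li>88bhp</li>
  <li>Automatic</li>
  <li>Electric</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
specs = soup.find("ul", class_="listing-key-specs")

# 1. By index: find_all returns a list-like ResultSet.
items = specs.find_all("li")
by_index = items[2].get_text(strip=True)

# 2. By sibling navigation: start at the first li and take its later li siblings.
later_siblings = specs.li.find_next_siblings("li")
by_sibling = later_siblings[1].get_text(strip=True)

# 3. By content: pick the first li whose text ends with "miles" (assumed format).
by_text = next(
    (li.get_text(strip=True) for li in items
     if li.get_text(strip=True).endswith("miles")),
    None,
)

print(by_index, by_sibling, by_text)
```

Matching on the text is slower to write but does not depend on the position of the item, which may matter if some listings omit a spec.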

[Comments]:

Thanks! I didn't realize it could be accessed like a list.