[Posted]: 2019-10-24 04:32:46
[Question]:
I have the following HTML code:
<div class="matches">
    <div class="conf">
        Brazil vs. Colombia
    </div>
    <div class="targetHour"> 08:00 pm </div>
</div>
<div class="matches">
    <div class="conf">
        Chile x Argentina
    </div>
    <div class="targetHour"> 08:00 pm </div>
</div>
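I need to get the value of the parent div and the value of its child div without duplicated results, so that each match's time is associated with its own parent.
Here is my Python code: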
for nc in soup.find_all('div', attrs={'class': 'league-data'}):
    campeonato = nc.text
    for hr in soup.find('div', attrs={'class': 'match row cf'}).findAll("div", recursive=False):
        print(campeonato + "|" + hr.text)
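One likely cause of the repeated output is that the inner loop searches the whole `soup` again instead of the current parent element, so every parent gets paired with the same children. The following is a minimal sketch of scoping the lookup to each parent. It assumes the markup looks like the HTML sample in the question (classes `matches`, `conf`, `targetHour`); the `html` string and the `pair_matches` helper are illustrative names, not part of the original code, and the real page may use different class names.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the HTML in the question (an assumption about the real page).
html = """
<div class="matches">
  <div class="conf">Brazil vs. Colombia</div>
  <div class="targetHour">08:00 pm</div>
</div>
<div class="matches">
  <div class="conf">Chile x Argentina</div>
  <div class="targetHour">08:00 pm</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def pair_matches(soup):
    """Return one (fixture, hour) tuple per div.matches block."""
    pairs = []
    for match in soup.find_all("div", class_="matches"):
        # Search inside `match`, not `soup`, so each hour stays with its own parent.
        conf = match.find("div", class_="conf")
        hour = match.find("div", class_="targetHour")
        if conf is None or hour is None:
            continue  # skip incomplete blocks instead of raising AttributeError
        pairs.append((conf.get_text(strip=True), hour.get_text(strip=True)))
    return pairs


pairs = pair_matches(soup)
for fixture, hour in pairs:
    print(fixture + "|" + hour)
```

Because each `find` call is made on the parent element, a child can only be reported together with the block that contains it.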
[Discussion]:
- Why not keep a list of the items already collected and filter against it?
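A rough sketch of what this comment suggests, assuming the scraped rows are plain strings such as `"fixture|hour"`. The `unique_rows` helper and the sample `rows` list are hypothetical and only illustrate the idea. Note that this hides repeated rows but does not correct a wrong parent/child pairing.

```python
def unique_rows(rows):
    """Yield each row once, preserving first-seen order."""
    seen = set()
    for row in rows:
        if row in seen:
            continue  # already collected, filter it out
        seen.add(row)
        yield row


# Hypothetical scraped output in which nested loops repeated a row.
rows = [
    "Brazil vs. Colombia|08:00 pm",
    "Chile x Argentina|08:00 pm",
    "Brazil vs. Colombia|08:00 pm",
]

result = list(unique_rows(rows))
for row in result:
    print(row)
```

A set gives constant-time membership checks, which matters more than a list would once the page has many matches.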
Tags: python python-3.x beautifulsoup selenium-chromedriver