【Question Title】: Parsing a Table from the following website
【Posted】: 2018-04-26 23:56:11
【Question Description】:
【Discussion】:
Tags:
python-2.7
beautifulsoup
html-parsing
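【Solution 1】:
It would be better if you had included some of the code you tried. In any case, this code works for the January 1 table. You can also write a loop to extract the data for the other days.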
from urllib.request import urlopen  # Python 3; on Python 2.7 use: from urllib2 import urlopen
from bs4 import BeautifulSoup

url = "https://www.timeanddate.com/weather/india/kanpur/historic?month=1&year=2016"
page = urlopen(url)
soup = BeautifulSoup(page, 'lxml')  # the 'lxml' parser requires the lxml package

Data = []
# The weather history table is identified by id="wt-his"
table = soup.find('table', attrs={'id': 'wt-his'})
for tr in table.find('tbody').find_all('tr'):
    row = {}  # named "row" so the built-in dict is not shadowed
    row['time'] = tr.find('th').text.strip()
    all_td = tr.find_all('td')
    row['temp'] = all_td[1].text
    row['weather'] = all_td[2].text
    row['wind'] = all_td[3].text
    arrow = all_td[4].text
    # Only two cases are handled here; anything other than '↑' falls into the else branch
    if arrow == '↑':
        row['wind_dir'] = 'South to North'
    else:
        row['wind_dir'] = 'North to South'
    row['humidity'] = all_td[5].text
    row['barometer'] = all_td[6].text
    row['visibility'] = all_td[7].text
    Data.append(row)
Note: add the other cases to the wind_dir logic.
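One possible way to cover those extra cases is a lookup table instead of an if/else chain. This is only a sketch: the '↑' entry comes from the answer's code, while the other arrow glyphs, their labels, the 'Unknown' fallback, and the names ARROW_TO_DIRECTION and wind_direction are assumptions made for illustration. Check which characters the page actually puts in that cell before relying on it.

```python
# Hypothetical lookup table. Only the '↑' entry is taken from the answer's code;
# the remaining glyphs and labels are assumptions extended by analogy.
ARROW_TO_DIRECTION = {
    '↑': 'South to North',
    '↓': 'North to South',
    '→': 'West to East',
    '←': 'East to West',
}


def wind_direction(arrow):
    """Return a readable direction for an arrow glyph, or 'Unknown' if unmapped."""
    # strip() drops whitespace that may surround the cell text
    return ARROW_TO_DIRECTION.get(arrow.strip(), 'Unknown')
```

Inside the loop, the if/else block could then be replaced by a single assignment of wind_direction(arrow) to the 'wind_dir' key. Unlike the original else branch, an unrecognized glyph is reported as 'Unknown' rather than silently treated as 'North to South'.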