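【Posted】: 2016-04-15 03:18:24
【Question】:
I have a Python script that extracts data from an HTML page. The HTML contains a table of floating-point values. When I call findAll and assign the result, the variable ends up with type unicode, but I need it to be a float. I thought I could simply convert it with float(), but that raises an error: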
  File "percentages.py", line 52, in <module>
    top_score = float(row.findAll('td')[2].text.strip())
ValueError: invalid literal for float(): 62.4%
The code snippet is as follows:
for link in stat_links:
    r = requests.get(link)
    soup = BeautifulSoup(r.text, "html.parser")
    table = soup.find('table', class_="tr-table datatable scrollable")
    team_rows = table.findAll('tr')
    team_rows = team_rows[1:]
    for row in team_rows:
        if row.findAll('td')[0].text.strip() == '1':
            top_score = float(row.findAll('td')[2].text.strip())
        if row.findAll('td')[0].text.strip() == '351':
            lowest_score = float(row.findAll('td')[2].text.strip())
    for row in team_rows:
        if row.findAll('td')[1].text.strip() == sys.argv[1]:
            temp = float(row.findAll('td')[2].text.strip())
            if link == "https://www.teamrankings.com/ncaa-basketball/stat/average-scoring-margin":
                top_score = top_score + abs(lowest_score)
                temp = temp + abs(lowest_score)
                lowest_score = 0
            temp = (temp - lowest_score) / (top_score - lowest_score)
            team_one = team_one + temp
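The traceback shows that the cell text is "62.4%", and float() cannot parse the trailing percent sign. One possible workaround, as a minimal sketch, is to strip that sign before converting. The helper name parse_percent is hypothetical and not part of the original script; the sketch also assumes the only non-numeric character in a cell is a trailing "%".

```python
def parse_percent(text):
    """Convert a cell string such as u'62.4%' to the float 62.4.

    Hypothetical helper (not in the original script). Assumes the only
    non-numeric content is surrounding whitespace and a trailing '%'.
    """
    # Drop surrounding whitespace, then a trailing '%', then any
    # whitespace that sat between the number and the '%'.
    cleaned = text.strip().rstrip('%').strip()
    return float(cleaned)


# The value from the traceback, plus a cell without a percent sign.
value = parse_percent(u"62.4%")
plain = parse_percent(u" -3.5 ")
print(value, plain)
```

Under these assumptions, each float(row.findAll('td')[2].text.strip()) call in the loop could be replaced by parse_percent(row.findAll('td')[2].text). The result stays on the 0 to 100 scale; divide by 100 if a fraction is wanted.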
【Discussion】:
Tags: python html unicode beautifulsoup extract