How do you extract the floats from the elements in a Python list?

【Question title】: How do you extract the floats from the elements in a python list?
【Posted】: 2017-01-19 03:02:54
【Question description】:

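I am using BeautifulSoup4 to build a script that does financial calculations. I have successfully extracted the data into a list, but I only need the floats from the elements.

For example: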

Volume = soup.find_all('td', {'class':'text-success'})

print (Volume)

This gives me the following list output:

[<td class="text-success">+1.3 LTC</td>, <td class="text-success">+5.49<span class="muteds">340788</span> LTC</td>, <td class="text-success">+1.3 LTC</td>,]

I would like it to become:

[1.3, 5.49, 1.3]

How can I do this?

Thank you very much for reading my post. I appreciate any help I can get.

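【Comments】:

Possible duplicate of stackoverflow.com/questions/4703390/…
This list is obviously not a valid Python list. Did you mean ["<td ...>+1.3</td>", ...]?
@linusg It is not a valid Python list, but that is what the string representation of BeautifulSoup's ResultSet looks like.

Tags: python regex beautifulsoup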

【Solution 1】:

You can do

>>> import re
>>> re.findall(r"\d+\.\d+", yourString)
['1.3', '5.49', '1.3']
>>>

Then convert to floats

>>> [float(x) for x in re.findall(r"\d+\.\d+", yourString)]
[1.3, 5.49, 1.3]
>>>
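
As a self-contained sketch of the two steps above: the answer does not say how `yourString` is built, so the example below assumes it is the string form of the result list (for instance `str(Volume)`) and hard-codes that text. The names `your_string` and `FLOAT_PATTERN` are illustrative only.

```python
import re

# Assumed input: the string form of the list shown in the question.
your_string = (
    '[<td class="text-success">+1.3 LTC</td>, '
    '<td class="text-success">+5.49<span class="muteds">340788</span> LTC</td>, '
    '<td class="text-success">+1.3 LTC</td>]'
)

# A raw string passes the backslashes through to the regex engine unchanged.
FLOAT_PATTERN = re.compile(r"\d+\.\d+")

# findall returns the matched substrings; float() converts each one.
values = [float(match) for match in FLOAT_PATTERN.findall(your_string)]
print(values)  # [1.3, 5.49, 1.3]
```

Because the pattern requires a decimal point, the integer 340788 inside the span is skipped. A whole-number amount such as "+2 LTC" would be skipped as well, so the pattern may need adjusting for such data.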

【Discussion】:

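【Solution 2】:

You can find the first text node in each td, split it on a space, take the first item, and convert it to a float with float(). The leading + is handled automatically: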

from bs4 import BeautifulSoup

data = """
<table>
    <tr>
        <td class="text-success">+1.3 LTC</td>
        <td class="text-success">+5.49<span class="muteds">340788</span> LTC</td>
        <td class="text-success">+1.3 LTC</td>
    </tr>
</table>"""

soup = BeautifulSoup(data, "html.parser")

print([
    float(td.find(text=True).split(" ", 1)[0])
    for td in soup.find_all('td', {'class':'text-success'})
])

This prints [1.3, 5.49, 1.3].

Note how find(text=True) helps avoid extracting 340788 from the second td.
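
The string-handling part of this approach can be checked on its own, without BeautifulSoup. The sketch below assumes the first text nodes of the three sample cells are "+1.3 LTC", "+5.49" and "+1.3 LTC"; `leading_float` is a hypothetical helper name, not part of any library.

```python
# First text nodes as they would appear for the sample cells (assumed values).
first_text_nodes = ["+1.3 LTC", "+5.49", "+1.3 LTC"]


def leading_float(text):
    """Return the number at the start of text, ignoring anything after the first space."""
    token = text.split(" ", 1)[0]  # "+1.3 LTC" -> "+1.3"
    return float(token)            # float() accepts a leading "+" sign


amounts = [leading_float(text) for text in first_text_nodes]
print(amounts)  # [1.3, 5.49, 1.3]
```

In the second cell the first text node ends where the span begins, so it contains no space and is converted as a whole.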

【Discussion】:

Thank you, this is awesome. It works great!