Scraping tables with BeautifulSoup: answers

[Question title]: Scraping tables with BeautifulSoup
[Posted]: 2014-03-05 10:34:48
[Question]:

I seem to be stuck. Suppose I have the following table:

<table align=center cellpadding=3 cellspacing=0 border=1>
<tr bgcolor="#EEEEFF">
   <td align="center">
   40   </td>
   <td align="center">
   44   </td>
   <td align="center">
   <font color="green"><b>+4</b></font>
   </td>
   <td align="center">
   1,000</td>
   <td align="center">
   15,000   </td>
   <td align="center">
   44,000   </td>
   <td align="center">
   <font color="green"><b><nobr>+193.33%</nobr></b></font>
   </td>

</tr>

What is the ideal way to extract the 44,000 td from the table using find_all?

[Question comments]:

What distinguishes the 44,000 table cell from the other cells? Why that particular value?

Tags: python beautifulsoup

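[Solution 1]:

If the value always sits at the same position in the table you want to scrape, I would use Beautiful Soup to extract all the cells in the table and then pick out that one by index. See the code below.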

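from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')  # html: the page source as a string
known_position = 5  # zero-based index of the 44,000 cell in the row above
tds = soup.find_all('td')
number = tds[known_position].get_text(strip=True)  # .text is a property, not a method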
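As a runnable illustration of the positional approach, here is a minimal sketch that parses the row from the question and reads the sixth cell. The helper name `cell_at` and the variable `HTML` are made up for this example, a closing `</table>` tag has been added to the sample, and the built-in `html.parser` backend is assumed; it only works if the cell order on the real page never changes.

```python
from bs4 import BeautifulSoup

# Row copied from the question; the closing </table> is added here
# only so the sample is complete.
HTML = """
<table align=center cellpadding=3 cellspacing=0 border=1>
<tr bgcolor="#EEEEFF">
   <td align="center">
   40   </td>
   <td align="center">
   44   </td>
   <td align="center">
   <font color="green"><b>+4</b></font>
   </td>
   <td align="center">
   1,000</td>
   <td align="center">
   15,000   </td>
   <td align="center">
   44,000   </td>
   <td align="center">
   <font color="green"><b><nobr>+193.33%</nobr></b></font>
   </td>
</tr>
</table>
"""


def cell_at(html, position):
    """Return the whitespace-stripped text of the td at a zero-based position.

    Hypothetical helper for this example, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    cells = soup.find_all("td")
    return cells[position].get_text(strip=True)


value = cell_at(HTML, 5)
print(value)  # 44,000
```

`get_text(strip=True)` is used instead of `.text` so the leading newline and padding spaces inside each cell are dropped before the value is used.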

If, on the other hand, you are searching for one specific number, I would simply loop over the list.

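tds = soup.find_all('td')  # soup is a BeautifulSoup object built from the page
for td in tds:
    if td.get_text(strip=True) == 'number here':  # compare with ==, not =
        # do your stuff
        pass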
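The search-by-value loop can be sketched as a small function as well. The name `find_cells_with_text` and the two-cell `ROW` sample are assumptions made for this example; the comparison is done on stripped text because the cells in the question contain surrounding whitespace.

```python
from bs4 import BeautifulSoup


def find_cells_with_text(html, wanted):
    """Return every td whose stripped text equals `wanted`.

    Hypothetical helper for this example, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    return [td for td in soup.find_all("td")
            if td.get_text(strip=True) == wanted]


# Shortened sample row, modelled on the table in the question.
ROW = ('<table><tr><td align="center"> 15,000 </td>'
       '<td align="center"> 44,000 </td></tr></table>')

matches = find_cells_with_text(ROW, "44,000")

# Drop the thousands separator before converting to an integer.
amount = int(matches[0].get_text(strip=True).replace(",", ""))
print(len(matches), amount)  # 1 44000
```

Returning the matching tags rather than their text leaves the caller free to inspect attributes or neighbouring cells afterwards.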

[Comments]: