【Question Title】: Python BeautifulSoup: Iterating over a table
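【Posted】: 2014-10-02 10:48:59
【Question】:
I want to iterate over every TD of every TR tag. For example, suppose I get all the rows of the table:
trList = tbody.findAll('tr')
Later I want to get all the TD tags of each TR element separately.
Something like: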
trList[0]:
    td[0]
    td[1]  # I wanted to get this TD of every TR
    td[2]
trList[1]:
    td[0]
    td[1]  # this one as well
    td[2]
Normally I would use nested loops to get this.
Is that possible?
Tags:
python
beautifulsoup
html-table
【Solution 1】:
Yes. Call the same function, findAll, on each row:
trList = tbody.findAll('tr')
for tr in trList:
    tdList = tr.findAll('td')
    for td in tdList:
        # each td of the current row is available here
        print(td.text)
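For the specific case in the question (only the second TD of every TR), the inner loop can be replaced by an index. The following is a minimal sketch, not the answerer's code: the sample HTML and the helper name second_cells are made up for illustration, and find_all is the bs4 spelling of findAll.

```python
from bs4 import BeautifulSoup

# Hypothetical sample table, not taken from the original question.
html = """
<table>
<tbody>
<tr><td>a</td><td>b</td><td>c</td></tr>
<tr><td>d</td><td>e</td><td>f</td></tr>
<tr><td>g</td></tr>
</tbody>
</table>
"""

def second_cells(markup):
    """Return the text of the second <td> of every <tr> that has one."""
    soup = BeautifulSoup(markup, "html.parser")
    result = []
    for tr in soup.find_all("tr"):
        tds = tr.find_all("td")
        if len(tds) > 1:  # skip rows with fewer than two cells
            result.append(tds[1].get_text(strip=True))
    return result

print(second_cells(html))
```

The length check is there because indexing tds[1] on a short row would raise an IndexError.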
【Solution 2】:
The nth-of-type CSS selector helps here:
from bs4 import BeautifulSoup
data = """
<table>
<tr>
<td>1</td>
<td>2</td>
<td>3</td>
</tr>
<tr>
<td>4</td>
<td>5</td>
<td>6</td>
</tr>
<tr>
<td>7</td>
<td>8</td>
<td>9</td>
</tr>
</table>
"""
# Name the parser explicitly; html.parser does not insert a <tbody> element.
soup = BeautifulSoup(data, 'html.parser')
for td in soup.select('table > tr > td:nth-of-type(2)'):
    print(td.text)
Prints:
2
5
8
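One caveat with the selector above: 'table > tr' only matches rows that are direct children of the table. The question works from a tbody variable, so the real markup probably wraps its rows in a tbody element. The sketch below, using made-up markup with an explicit tbody, compares the strict selector with a relaxed one that drops the 'table >' prefix.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with an explicit <tbody>, as the question's tbody variable suggests.
html = """
<table>
<tbody>
<tr><td>1</td><td>2</td><td>3</td></tr>
<tr><td>4</td><td>5</td><td>6</td></tr>
</tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# 'table > tr' requires tr to be a direct child of table, so rows inside tbody are missed.
direct = [td.get_text() for td in soup.select("table > tr > td:nth-of-type(2)")]

# Without the 'table >' prefix, rows match wherever they sit inside the table.
relaxed = [td.get_text() for td in soup.select("tr > td:nth-of-type(2)")]

print(direct)
print(relaxed)
```

If the document contains several tables, scoping the relaxed selector with a descendant combinator (for example 'table tr > td:nth-of-type(2)' plus an id or class on the table) keeps the match limited to the intended one.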