【Posted】: 2014-11-23 03:48:08
【Question】:
I want to extract the parameter I pointed out in the image below...
What I have tried:
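import requests
from lxml import html
url = 'http://site.ir'
content = requests.get(url).content
tree = html.fromstring(content)
# '????' marks the part of the XPath I cannot figure out
print([e.text_content() for e in tree.xpath('//div[@class="grouptext"]/????')])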
The text I want is neither inside the span tag nor inside the br tag.
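For context, text that sits right after a closing tag is not lost in lxml: as far as I know it is stored on the preceding element as its `.tail`. Below is a minimal sketch of that idea. The `snippet` string is a hypothetical fragment that only mirrors the structure described here; it is not the real page.

```python
from lxml import html

# Hypothetical fragment mirroring the structure in the question (not the real page).
snippet = '<div class="grouptext"><span>span tag contents</span> WHAT I WANT <br></div>'

# For a single-element fragment, html.fromstring returns that element (the div).
div = html.fromstring(snippet)
span = div.find('span')

# lxml keeps text that follows an element's closing tag in that element's .tail
wanted = (span.tail or '').strip()
print(wanted)
```

This only works when the wanted text directly follows the span, which is an assumption about the page layout.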
Image:
Update
Imagine I have:
out=""" <div class="groupinfo">
<div class="grouptext">
<span style="color:#5f0101">
span tag contents
</span>
WHAT I WANT
<br></br>
</div>
</div> <div class="groupinfo">
<div class="grouptext">
<span style="color:#5f0101">
span tag contents
</span>
WHAT I WANT(1)
<br></br>
</div>
</div>
<div class="groupinfo">
<div class="grouptext">
<span style="color:#5f0101">
span tag contents
</span>
WHAT I WANT(2)
<br></br>
</div>
</div> <div class="groupinfo">
<div class="grouptext">
<span style="color:#5f0101">
span tag contents
</span>
WHAT I WANT(3)
<br></br>
</div>
</div> """
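One possible way to get at the text in a structure like this is to select the div's own text nodes with XPath `text()`, which skips the span's contents. The sketch below is only an illustration under assumptions: `sample` is a shortened, hypothetical copy of the markup above, and the real page may be laid out differently.

```python
from lxml import html

# Shortened, hypothetical copy of the markup above (two groups instead of four).
sample = """
<div class="groupinfo">
<div class="grouptext">
<span style="color:#5f0101">
span tag contents
</span>
WHAT I WANT
<br></br>
</div>
</div>
<div class="groupinfo">
<div class="grouptext">
<span style="color:#5f0101">
span tag contents
</span>
WHAT I WANT(1)
<br></br>
</div>
</div>
"""

tree = html.fromstring(sample)

# text() selects only text nodes that are direct children of the div, so the
# span's contents are skipped; [normalize-space()] drops whitespace-only nodes.
texts = tree.xpath('//div[@class="grouptext"]/text()[normalize-space()]')

# Trim the surrounding newlines left over from the markup.
results = [t.strip() for t in texts]
print(results)
```

Note that the XPath results here are plain strings rather than elements, so `.strip()` is used instead of `text_content()`.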
【Discussion】:
Tags: python html xpath html-parsing lxml