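【Posted】: 2012-03-28 13:26:38
【Question】:
I have HTML like the sample shown below. I want to get the text inside each <span class="zzAggregateRatingStat">. For the example given below, I would get 3 and 5.
For this task I am using Python 2.7 and lxml.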
<div class="pp-meta-review">
<span class="zrvwidget" style="">
<span g:inline="true" g:type="NumUsersFoundThisHelpful" g:hideonnoratings="true" g:entity.annotation.groups="maps" g:entity.annotation.id="http://maps.google.com/?q=Central+Kia+of+Irving++(972)+659-2204+loc:+1600+East+Airport+Freeway,+Irving,+TX+75062&gl=US&sll=32.83624,-96.92526" g:entity.annotation.author="AIe9_BH8MR-1JD_4BhwsKrGCazUyU5siqCtjchckDcg5BAl5rOLd9nvhJJDTrtjL-xFI8D42bD_7">
<span class="zzNumUsersFoundThisHelpfulActive" zzlabel="helpful">
<span>
<span class="zzAggregateRatingStat">3</span>
</span>
<span>
<span> </span>
out of
<span> </span>
</span>
<span>
<span class="zzAggregateRatingStat">5</span>
</span>
<span>
<span> </span>
people found this review helpful.
</span>
</span>
</span>
</span>
</div>
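One possible way to pull those values out with lxml is sketched below. It is only a sketch under stated assumptions: the markup is a trimmed copy of the sample above, the helper name `rating_stats` is hypothetical, and the XPath assumes the `class` attribute is exactly `zzAggregateRatingStat` with no additional class names.

```python
from lxml import html

# Trimmed copy of the sample markup; the g:* attributes are omitted for brevity.
SNIPPET = """
<div class="pp-meta-review">
  <span class="zzNumUsersFoundThisHelpfulActive" zzlabel="helpful">
    <span><span class="zzAggregateRatingStat">3</span></span>
    <span> out of </span>
    <span><span class="zzAggregateRatingStat">5</span></span>
    <span> people found this review helpful.</span>
  </span>
</div>
"""


def rating_stats(markup):
    """Return the text of every span whose class is zzAggregateRatingStat.

    Hypothetical helper: assumes the class attribute matches exactly.
    """
    tree = html.fromstring(markup)
    # .//span searches all descendants of the parsed element;
    # text() selects the direct text nodes of each matching span.
    texts = tree.xpath('.//span[@class="zzAggregateRatingStat"]/text()')
    return [t.strip() for t in texts]


stats = rating_stats(SNIPPET)
print(stats)
```

For the trimmed sample this yields the two strings `3` and `5`, in document order. If the spans might carry more than one class, a `contains(@class, ...)` predicate may be a better fit than the exact match used here.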
【Comments】:
-
Get.
-
...and complete the question by showing what you have tried.
-
I'm really sorry about the typo. Stack Overflow treated it as an HTML tag.
Tags: python web-scraping lxml python-2.7