【Question title】: How to select this element with Scrapy XPath?
【Posted】: 2020-06-20 21:42:00
【Description】:

The only requirement: the expression needs to reference the thread-navigation class, because the page has many other pagination elements.

<section id="thread-navigation" class="group">
<div class="float-left">
<div class="pagination talign-mleft">
<span class="pages">Pages (6):</span>
<span class="pagination_current">1</span>
<a href="I want this text?page=2" class="pagination_page">2</a>

<a href=""I want this text?page=3" class="pagination_page">3</a>
<a href=""I want this text?page=4" class="pagination_page">4</a>
<a href=""I want this text?page=5" class="pagination_page">5</a>
<a href=""I want this text?page=6" class="pagination_last">6</a>
<a href=""I want this text?page=2" class="pagination_next">Next &raquo;</a> //<--- this one
</div>
</div>
</section>

I am trying something like this: r.xpath('//*[@class="thread-navigation" and contains (., "Next")]').get() but it always returns None.

Thanks

【Comments】:

  • It looks like you have a quoting problem: is href="" a typo?

Tags: xpath scrapy web-crawler


【Solution 1】:

thread-navigation is not the value of a @class attribute; it is the value of the @id attribute. So try this XPath 1.0 expression:

r.xpath('//a[ancestor::*/@id="thread-navigation" and contains (text(), "Next")]/@href').get()

Its result is

I want this text?page=2

【Comments】:

【Solution 2】:

Use this XPath, which selects the href of every link inside the thread-navigation section:

'//section[@id="thread-navigation"]//a/@href'

【Comments】:
