【Question Title】: XPath select text
【Posted】: 2014-01-12 22:23:59
【Question】:

I have been playing with XPath. I can get it to work when I select paragraphs, but it doesn't seem to work for this text.

Here is the HTML:

<span id="favorite_count" style="display: block;">
    <span style="cursor:help; border-bottom: 1px dotted black;" title="Active members who have made you their favorite.  This number may change as new members join, or close their accounts.">My total number of <span class="favorites">:</span>
    </span> 
19458
</span>

I am trying to select 19458.

This is my XPath code:

$favorites   = $data->xpath( '//span[@id="favorite_count"]/text()' );

Note:

I know it has something to do with the line above, because when I use

$favorites   = $data->xpath( '//span[@id="favorite_count"]/span' );

I get the result "My total number of".

I also can't change the HTML, because it comes from a page I have no access to modify.

【Question Comments】:

    Tags: php html xml xpath


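    【Answer 1】:

    You are using the SimpleXML library. Its SimpleXMLElement::xpath() method has a limitation: it cannot return text nodes.

    To get that working you would need to extend SimpleXMLElement and adapt the result on the fly. That is easy to do with the help of its sister library, DOM.

    Example code: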

    echo (new DOMXpath(
       dom_import_simplexml(
          simplexml_load_string($html)
       )->ownerDocument
    ))->evaluate('normalize-space(//span[@id="favorite_count"]/text()[last()])');
    

    Program output:

    19458

    Demo: https://eval.in/82605
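The one-liner above can be unpacked into separate steps. What follows is a minimal sketch, not the answer author's exact code: it assumes the markup is well-formed XML (which SimpleXML requires), inlines a copy of the question's snippet with a shortened `title` attribute, and uses the illustrative variable names `$html`, `$xml`, `$xpath`, and `$favorites`.

```php
<?php
// Sample markup adapted from the question (title attribute shortened for brevity).
$html = <<<'HTML'
<span id="favorite_count" style="display: block;">
    <span style="cursor:help; border-bottom: 1px dotted black;" title="Active members who have made you their favorite.">My total number of <span class="favorites">:</span>
    </span>
19458
</span>
HTML;

$xml = simplexml_load_string($html);

// dom_import_simplexml() returns the DOMElement behind the SimpleXML node;
// its ownerDocument is the DOMDocument that DOMXPath operates on.
$xpath = new DOMXPath(dom_import_simplexml($xml)->ownerDocument);

// evaluate() can return a typed result (here a string).
// text()[last()] selects the last text node directly under the outer span,
// and normalize-space() strips the surrounding whitespace and line breaks.
$favorites = $xpath->evaluate(
    'normalize-space(//span[@id="favorite_count"]/text()[last()])'
);

echo $favorites, "\n"; // prints 19458
```

Using `evaluate()` rather than `query()` matters here because `normalize-space()` yields a string, not a node list.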

    In your very specific case, you can also do the following directly with SimpleXML:

    echo trim($xml->xpath('//span[@id="favorite_count"]')[0]);
    

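    This works because the text of the inner <span> is not part of the outer element's own node value: casting the outer element to a string returns only the whitespace, the line separators and the number 19458, which trim() then reduces to 19458.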
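To make that behavior visible, here is a small sketch that compares the string cast with the DOM `textContent` property. It assumes a simplified copy of the question's markup; the names `$xml`, `$span`, `$own`, and `$all` are chosen for illustration only.

```php
<?php
// Simplified, well-formed copy of the markup from the question.
$xml = simplexml_load_string(
    '<root><span id="favorite_count">' . "\n" .
    '    <span>My total number of <span class="favorites">:</span>' . "\n" .
    '    </span>' . "\n" .
    '19458' . "\n" .
    '</span></root>'
);

$span = $xml->xpath('//span[@id="favorite_count"]')[0];

// Casting a SimpleXMLElement to string yields only its own (direct) text,
// so the label inside the inner <span> is not included.
$own = trim((string) $span);

// For comparison, DOM's textContent also includes the text of all descendants.
$all = dom_import_simplexml($span)->textContent;

echo $own, "\n";                                              // prints 19458
var_dump(strpos($all, 'My total number of') !== false);      // bool(true)
```

If the number were not the only non-whitespace text directly under the outer element, the cast would return the other text as well, so this shortcut depends on the shape of this particular page.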


      【Answer 2】:

      HTML code:

       <html>
          <head></head>
          <body>
              <span id="favorite_count" style="display: block;">
                  <span style="cursor:help; border-bottom: 1px dotted black;" title="Active members who have made you their favorite. This number may change as new members join, or close their accounts.">My total number of <span class="favorites">:</span>
                  </span> 
                  19458
              </span>
          </body>
      </html>
      

      PHP code:

      /* Use internal libxml errors (suppresses parser warnings) -- turn on in production, off for debugging */
      libxml_use_internal_errors(true);
      /* Create a new DOMDocument object */
      $dom = new DOMDocument;
      /* Load the HTML */
      $dom->loadHTMLFile("test.html");
      /* Create a new XPath object */
      $xpath = new DOMXPath($dom);
      /* Query the text nodes that are direct children of the element with id "favorite_count" */
      $nodes = $xpath->query("//*[@id='favorite_count']/text()");
      /* Set HTTP response header to plain text for debugging output */
      header("Content-type: text/plain");
      /* Traverse the DOMNodeList object to output each DOMNode's nodeValue */
      foreach ($nodes as $i => $node) {
          echo "Node($i): ", $node->nodeValue, "\n";
      }
      

      Output:

      Node(0):

      Node(1):
      19458
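The first node in that output is the whitespace-only text between the outer and inner span. As a possible refinement (not part of the original answer), the XPath predicate `[normalize-space()]` can drop such nodes. The sketch below assumes the HTML is passed as a string via `loadHTML()` instead of being read from `test.html`, and the names `$html` and `$values` are illustrative.

```php
<?php
// Inline copy of the test document (title attribute shortened for brevity).
$html = <<<'HTML'
<html>
    <head></head>
    <body>
        <span id="favorite_count" style="display: block;">
            <span style="cursor:help;" title="Active members who have made you their favorite.">My total number of <span class="favorites">:</span>
            </span>
            19458
        </span>
    </body>
</html>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// text()[normalize-space()] keeps only the text nodes that still contain
// characters after whitespace normalization, i.e. it skips blank nodes.
$nodes = $xpath->query("//*[@id='favorite_count']/text()[normalize-space()]");

$values = [];
foreach ($nodes as $node) {
    // trim() removes the indentation and line breaks around the number.
    $values[] = trim($node->nodeValue);
}

print_r($values); // a single element: 19458
```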
