Get href using XPath

【Question Title】: Get href using xpath
【Posted】: 2016-02-20 13:31:44
【Question】:

I am trying to use XPath to extract two pieces of data:

the text node value, and
the hyperlink.

Here is my code:

<?php
$curl = curl_init('http://www.livescore.com/soccer/england/league-2/');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10');
$html = curl_exec($curl);
curl_close($curl);
if (!$html) 
    {
    die("something's wrong!");
    }

$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$result = $xpath->query("/html/body/div[2]/div[5]/div[contains(@class, 'row')]");

var_dump ($result);
foreach($result as $row)
    {   

    $text = $row->nodeValue;
    $href = $row->getAttribute("href");

    //getAttribute("href")

    $array[] = array
        (
        'text' => trim($text),
        'href' => $href
        );

    }
    print "<pre>";
    var_dump ($array);
?>

I just can't extract the href link! Any help would be much appreciated. Thanks a lot.

【Question Comments】:

Tags: php xpath web-scraping

【Solution 1】:

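Your query selects div elements, and a div has no href attribute, so getAttribute("href") on the row returns an empty string; the attribute lives on the a element inside the row. First, the data rows on that page can be targeted by the more specific class name row-gray. Then, to get the link inside the current div, you can use the relative XPath expression .//a[@class='scorelink'] and pass the row as the context node: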

$result = $xpath->query("//div[contains(@class, 'row-gray')]");

$array = array();
foreach ($result as $row)
{
    $text = $row->nodeValue;
    // Query relative to the current row; item(0) is null when the row has no link
    $link = $xpath->query(".//a[@class='scorelink']", $row)->item(0);
    $href = $link ? $link->getAttribute("href") : null;

    $array[] = array
    (
        'text' => trim($text),
        'href' => $href
    );
}
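
If you want to try the relative-query idea without hitting the live site, here is a minimal sketch that runs against a small inline HTML fragment. The markup is made up for illustration (only the class names row-gray and scorelink come from the answer above), so the real page may be structured differently.

```php
<?php
// Hypothetical markup: only the class names row-gray and scorelink
// are taken from the answer; everything else is an assumption.
$html = <<<HTML
<html><body>
  <div class="row row-gray"><a class="scorelink" href="/match/1">1 - 0</a> Team A v Team B</div>
  <div class="row row-gray">Team C v Team D</div>
</body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$rows = [];
foreach ($xpath->query("//div[contains(@class, 'row-gray')]") as $row) {
    // Passing $row as the second argument makes the leading "." refer to
    // this div, so only links inside the current row are matched.
    $link = $xpath->query(".//a[@class='scorelink']", $row)->item(0);

    $rows[] = [
        'text' => trim($row->nodeValue),
        // item(0) returns null for a row without a link, so guard the call.
        'href' => $link ? $link->getAttribute('href') : null,
    ];
}

print_r($rows);
```

The second row has no link on purpose: it shows why the null check matters, since calling getAttribute() on null would be a fatal error.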

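【Comments】:

The approach makes sense, but when I try to implement it I get "PHP Fatal error: Cannot use object of type DOMNodeList as array".
@Bam The answer has been updated to use ->item(0) in place of the array indexer [0].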
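
For anyone hitting the same error: DOMXPath::query() returns a DOMNodeList, which is not an array and does not support [0]. A small sketch of the two usual ways around it, using a throwaway two-link document (the markup and variable names are just for illustration):

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><a href="/a">A</a><a href="/b">B</a></body></html>');
$xpath = new DOMXPath($dom);

$links = $xpath->query('//a');   // a DOMNodeList, not an array

// $links[0] here would fail with
// "Cannot use object of type DOMNodeList as array".

// Option 1: item() returns the node at that index, or null if out of range.
$first   = $links->item(0);
$missing = $links->item(5);

// Option 2: DOMNodeList is traversable, so it can be copied into a
// real array when plain indexing is more convenient.
$asArray = iterator_to_array($links);

echo $first->getAttribute('href'), "\n";
echo $asArray[1]->getAttribute('href'), "\n";
var_dump($missing);
```

item() is usually enough when only the first match is needed; copying to an array is mainly useful if you want array functions such as array_map() on the results.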