[Posted]: 2014-05-17 00:19:59
[Question]:
I'm new to DOMXPath, but I'm trying to learn more. Currently I have an HTML structure like this:
<span class="1">
<div class="headerClass">
Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span>
</div>
<table class="tableClass" id="tableID">
<tr>
<td>some text</td>
<td>some text</td>
<td>some text</td>
</tr>
<tr>
<td>some text</td>
<td>some text</td>
<td><a href="http://www.website1.com" target="_blank">My Link</a></td>
</tr>
<tr>
<td>some text</td>
<td>some text</td>
<td><a href="http://www.website2.com" target="_blank">My Link</a></td>
</tr>
</table>
</span>
<span class="2">
<div class="headerClass">
Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span>
</div>
<table class="tableClass" id="tableID">
<tr>
<td>some text</td>
<td>some text</td>
<td>some text</td>
</tr>
<tr>
<td>some text</td>
<td>some text</td>
<td><a href="http://www.website1.com" target="_blank">My Link</a></td>
</tr>
<tr>
<td>some text</td>
<td>some text</td>
<td><a href="http://www.website2.com" target="_blank">My Link</a></td>
</tr>
</table>
</span>
... and the spans continue: 3, 4, 5 ... etc
To retrieve this HTML from the source file, I use this:
$oDomXpath = new DOMXPath($oDom);
$query = "//span[number(@class)=number(@class)]";
$oDomObject = $oDomXpath->query($query);
foreach ($oDomObject as $oObject) {
// WHAT GOES HERE????
}
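A note on the predicate in that query: in XPath 1.0, `number()` of a non-numeric string is NaN, and NaN never compares equal to anything, including itself. So `number(@class)=number(@class)` should hold only for spans whose class is a number. A minimal sketch to check this, assuming a small made-up fragment rather than the real source file:

```php
<?php
// Hypothetical fragment: two numbered wrapper spans and one non-numeric inner span.
$oDom = new DOMDocument();
$oDom->loadHTML(
    '<span class="1">a <span class="spanClass1">b</span></span><span class="2">c</span>'
);
$oDomXpath = new DOMXPath($oDom);

// Collect the class of every span the predicate lets through.
$classes = [];
foreach ($oDomXpath->query('//span[number(@class)=number(@class)]') as $oObject) {
    $classes[] = $oObject->getAttribute('class');
}

print_r($classes);
```

The inner `spanClass1` span is skipped because its class converts to NaN, so only the numbered wrappers are collected.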
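I need to store the following values in an array:
- All the plain text of <div class="headerClass">, without HTML tags.
- All the text of <span class="spanClass2">.
- All the URLs inside the table. The table can have any number of rows, from 0 to many.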
How can I do this? What do I have to put inside the foreach loop? Do I need to run another query?
Thanks a lot for your help!
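One possible approach, sketched under assumptions: `DOMXPath::query()` and `DOMXPath::evaluate()` both accept a context node as their second argument, so relative queries (starting with `.`) can be run against each matched span inside the loop. The sample HTML, the `$results` array, and its keys (`header`, `span`, `urls`) are hypothetical names chosen for illustration.

```php
<?php
// Hypothetical sample input, shortened from the structure in the question.
$html = <<<'HTML'
<span class="1">
<div class="headerClass">
Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span>
</div>
<table class="tableClass" id="tableID">
<tr><td>some text</td><td>some text</td><td>some text</td></tr>
<tr><td>some text</td><td>some text</td><td><a href="http://www.website1.com" target="_blank">My Link</a></td></tr>
<tr><td>some text</td><td>some text</td><td><a href="http://www.website2.com" target="_blank">My Link</a></td></tr>
</table>
</span>
<span class="2">
<div class="headerClass">
Header with <span class="spanClass2">no links</span>
</div>
<table class="tableClass" id="tableID"></table>
</span>
HTML;

$oDom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings (e.g. duplicate ids) instead of printing them
$oDom->loadHTML($html);
libxml_clear_errors();

$oDomXpath = new DOMXPath($oDom);
$results = [];

foreach ($oDomXpath->query('//span[number(@class)=number(@class)]') as $oObject) {
    // normalize-space() takes the string value of the first matching node,
    // which is its text with all tags stripped and whitespace collapsed.
    $header = $oDomXpath->evaluate('normalize-space(.//div[@class="headerClass"])', $oObject);
    $spanText = $oDomXpath->evaluate('normalize-space(.//span[@class="spanClass2"])', $oObject);

    // Zero rows or zero links simply yields an empty array.
    $urls = [];
    foreach ($oDomXpath->query('.//table[@class="tableClass"]//a/@href', $oObject) as $href) {
        $urls[] = $href->nodeValue;
    }

    $results[] = ['header' => $header, 'span' => $spanText, 'urls' => $urls];
}

print_r($results);
```

The leading `.` in each inner expression matters: without it, `//div[...]` would search the whole document again instead of only the current span. The descendant step `//a` is used rather than a fixed `tr/td/a` path so the query does not depend on whether the parser inserts a `tbody` element.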
[Discussion]:
Tags: php html dom xpath domxpath