[Posted]: 2023-03-04 19:05:01
[Question]:
I rewrote a script that used PHP DOM functions to traverse an XML file with the following structure:
<file>
  <record>
    <Source>
      <SourcePlace>
        <Country>Germany</Country>
      </SourcePlace>
    </Source>
    <Person>
      <Name>
        <firstname>John</firstname>
        <lastname>Doe</lastname>
      </Name>
    </Person>
  </record>
  <record>
  ..
  </record>
</file>
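I replaced it with a script that uses XMLReader to locate each individual record, converts that record into a DOM document, and then iterates over it. The iteration works by checking whether a node has child nodes: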
function findLeaves($node) {
    // Print every visited node, then recurse into its children.
    echo "nodeType: " . $node->nodeType . ", nodeName:" . $node->nodeName . "\n";
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $element) {
            findLeaves($element);
        }
    } else {
        // <do something with the leaf>
    }
}
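The problem is that the behavior of findLeaves() differs between the two versions. With DOM, nodes without a value of their own (such as Source) have no #text child nodes. The output for the XML above is: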
nodeType: 1, nodeName:Source
nodeType: 1, nodeName:SourcePlace
nodeType: 1, nodeName:Country
nodeType: 3, nodeName:#text
With XMLReader it becomes:
nodeType: 1, nodeName:Source
nodeType: 3, nodeName:#text
nodeType: 1, nodeName:SourcePlace
nodeType: 3, nodeName:#text
nodeType: 1, nodeName:Country
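Before passing the data to this function I checked the saveXML() result, and apart from some extra whitespace it looks the same. What could be causing this difference?
The code that loads the file before calling findLeaves() with DOM: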
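To narrow the difference down, here is a minimal reproduction sketch. It assumes an inline sample string (`$xml`) instead of the real `$file`, and it only counts the direct children of the first Source element in each approach. As far as I understand, `preserveWhiteSpace` is consulted while a document is being parsed by `load()` or `loadXML()`, so setting it after `importNode()` would not remove text nodes that already exist; treat that as an assumption to verify rather than a confirmed diagnosis.

```php
<?php
// Hypothetical inline sample; the real script reads from $file instead.
$xml = "<file>\n<record>\n<Source>\n<SourcePlace>\n"
     . "<Country>Germany</Country>\n</SourcePlace>\n</Source>\n</record>\n</file>";

// Path 1: DOMDocument, with preserveWhiteSpace disabled BEFORE parsing.
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadXML($xml);
$domSource = $dom->getElementsByTagName('Source')->item(0);

// Path 2: XMLReader::expand(), imported into a fresh DOMDocument.
$reader = new XMLReader();
$reader->XML($xml);
$copy = null;
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'record') {
        $copy = new DOMDocument();
        $copy->appendChild($copy->importNode($reader->expand(), true));
        // Set only after the nodes have already been created.
        $copy->preserveWhiteSpace = false;
        break;
    }
}
$reader->close();
$readerSource = $copy->getElementsByTagName('Source')->item(0);

echo "DOM children: ", $domSource->childNodes->length, "\n";
echo "XMLReader children: ", $readerSource->childNodes->length, "\n";
```

With this sample the first count should be 1 (only SourcePlace) and the second 3 (a line-break text node on each side of SourcePlace), which matches the extra #text entries in the output above.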
$xmlDoc = new DOMDocument();
$xmlDoc->preserveWhiteSpace = false;
$xmlDoc->load($file);
$xpath = new DOMXPath($xmlDoc);
$records = $xpath->query('//record');
foreach ($records as $record) {
    foreach ($xpath->query('.//Source', $record) as $source_record) {
        findLeaves($source_record);
    }
}
The code that loads the file before calling findLeaves() with XMLReader:
$xmlDoc = new XMLReader();
$xmlDoc->open($file);
while ($xmlDoc->read()) {
    if ($xmlDoc->nodeType == XMLReader::ELEMENT && $xmlDoc->name == 'record') {
        // Expand the current <record> and copy it into its own DOM document.
        $record_node = $xmlDoc->expand();
        $recordDOM = new DOMDocument();
        $n = $recordDOM->importNode($record_node, true);
        $recordDOM->appendChild($n);
        $recordDOM->preserveWhiteSpace = false;
        $xpath = new DOMXPath($recordDOM);
        $records = $xpath->query('//record');
        foreach ($records as $record) {
            foreach ($xpath->query('.//Source', $record) as $source_record) {
                findLeaves($source_record);
            }
        }
    }
}
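One possible way to make the traversal independent of how the nodes were created is to skip whitespace-only text nodes inside the recursion itself. The sketch below is not the original findLeaves(); `collectLeaves()` and the inline `$xml` sample are hypothetical names introduced for illustration, and the "leaf" action is reduced to recording the node name and value.

```php
<?php
// Hypothetical helper: collects leaf nodes, ignoring whitespace-only text nodes.
function collectLeaves(DOMNode $node, array &$leaves): void
{
    if ($node->nodeType === XML_TEXT_NODE && trim($node->nodeValue) === '') {
        return; // indentation or line breaks between elements
    }
    if (!$node->hasChildNodes()) {
        $leaves[] = $node->nodeName . '=' . $node->nodeValue;
        return;
    }
    foreach ($node->childNodes as $child) {
        collectLeaves($child, $leaves);
    }
}

// Hypothetical single-record sample standing in for the real file.
$xml = "<record>\n<Source>\n<SourcePlace>\n"
     . "<Country>Germany</Country>\n</SourcePlace>\n</Source>\n</record>";

$reader = new XMLReader();
$reader->XML($xml);
$reader->read(); // advance to the <record> root element

$doc = new DOMDocument();
$doc->appendChild($doc->importNode($reader->expand(), true));
$reader->close();

$leaves = [];
collectLeaves($doc->documentElement, $leaves);
print_r($leaves);
```

Filtering inside the traversal keeps the loading code untouched, at the cost of also discarding text nodes that are intentionally blank; whether that trade-off is acceptable depends on the data.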
[Discussion]: