This XPath 1.0 expression selects exactly the wanted nodes:
/*/span[.='Heading4']
/following-sibling::text()
[count(.|/*/span[.='Heading5']/preceding-sibling::text())
=
count(/*/span[.='Heading5']/preceding-sibling::text())
]
[normalize-space()]
It is derived from the well-known Kayessian method for the intersection of two node-sets $ns1 and $ns2:
$ns1[count(.|$ns2) = count($ns2)]
The expression above is obtained by substituting, in the Kayessian formula, $ns1 with:
/*/span[.='Heading4']/following-sibling::text()
and $ns2 with:
/*/span[.='Heading5']/preceding-sibling::text()
The final predicate [normalize-space()] filters the whitespace-only text nodes out of this intersection.
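The counting trick behind the Kayessian formula can also be mimicked outside XPath. The following is only a rough Python sketch, not part of the original solution: it uses the standard library's xml.etree.ElementTree, which has no text nodes of its own, so each text node is represented by the index of the child element whose `.tail` holds it. The helper names `kayessian_intersection` and `text_between`, and the shortened sample document, are hypothetical and chosen just for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample in the same shape as the document in question.
DOC = """<html>
<span>Heading4</span>
<br /> data#4.1
<br /> data#4.2
<br />
<span>Heading5</span>
<br /> data#5.1
</html>"""


def kayessian_intersection(ns1, ns2):
    """Mirror $ns1[count(.|$ns2) = count($ns2)].

    A node n belongs to ns2 exactly when adding it to ns2
    does not make the union any larger.
    """
    ns2 = set(ns2)
    return [n for n in ns1 if len({n} | ns2) == len(ns2)]


def text_between(root, start, end):
    """Return the non-blank text found between two heading spans."""
    children = list(root)
    labels = [child.text for child in children]  # None for <br/>
    i_start = labels.index(start)
    i_end = labels.index(end)

    # Index i stands for the text node stored in children[i].tail,
    # i.e. the text that immediately follows child i.
    ns1 = range(i_start, len(children))  # following-sibling::text() of start
    ns2 = range(0, i_end)                # preceding-sibling::text() of end

    hits = kayessian_intersection(ns1, ns2)

    # Counterpart of [normalize-space()]: drop whitespace-only text.
    # Unlike XPath, this sketch also strips the surviving strings.
    return [children[i].tail.strip()
            for i in hits
            if children[i].tail and children[i].tail.strip()]


root = ET.fromstring(DOC)
result = text_between(root, "Heading4", "Heading5")
print(result)
```

In real XPath the same membership test is performed on node identity rather than on integer indices; the indices here are only a stand-in that ElementTree makes convenient.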
XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
    <xsl:copy-of select=
      "/*/span[.='Heading4']
         /following-sibling::text()
           [count(.|/*/span[.='Heading5']/preceding-sibling::text())
           =
            count(/*/span[.='Heading5']/preceding-sibling::text())
           ]
           [normalize-space()]
      "/>
  </xsl:template>
</xsl:stylesheet>
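When this transformation is applied to the provided XML document (with the entities replaced, because we have no DTD that defines them and they are not essential here):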
<html>
<span>Heading</span>
<br />
<br />
<span>Heading1</span>
<br /> data#1
<br />
<br />
<span>Heading4</span>
<br /> #acirc;#euro;#cent; data#4.1
<br /> #acirc;#euro;#cent; data#4.2
<br /> #acirc;#euro;#cent; data#4.3
<br /> #acirc;#euro;#cent; data#4.4
<br />
<br />
<span>Heading5</span>
<br /> #acirc;#euro;#cent; data#5.1
<br /> #acirc;#euro;#cent; data#5.2
<br /> #acirc;#euro;#cent; data#5.3
<br />
<br />
</html>
the XPath expression is evaluated and the result of this evaluation is copied to the output:
#acirc;#euro;#cent; data#4.1
#acirc;#euro;#cent; data#4.2
#acirc;#euro;#cent; data#4.3
#acirc;#euro;#cent; data#4.4