How to find the max attribute from an XML document using XPath 1.0

【Question title】: How to find the max attribute from an XML document using XPath 1.0
【Posted】: 2012-01-02 14:33:16
【Question】:

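Is there a way to query an XML document with XPath 1.0 and return the maximum value of a given attribute?

For example, is there a way to get the maximum id?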

<?xml version="1.0" encoding="utf-8"?>
<library>
        <book id="2" name="Dragon Tatoo"/>
        <book id="7" name="Ender's Game"/>
        <book id="3" name="Catch 22"/>
        <book id="1" name="Lord of the rings"/>
</library>

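【Comments】:

What is the host language executing the XPath? If you are limited to XPath 1.0 (which has no max function), it may be faster to select all the elements first and find the maximum in the host language.

Tags: xml xpath xpath-1.0

【Answer 1】:

In XPath 2.0, use the max function. To find the book with the highest id, use: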

/library/book[@id = max(/library/book/@id)]

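【Comments】:

It looks like the max function is not part of XPath 1.0 :(
@HerbSpiral: Hmm. I tried it in XQilla's XPath 1.0 compatibility mode and it worked, but maybe that is not true XPath 1.0.

【Answer 2】:

The following XPath selects the book with the highest id: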

/library/book[not(@id <= preceding-sibling::book/@id) and not(@id <= following-sibling::book/@id)]
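
As a rough illustration of how this expression behaves, here is a minimal Java (JAXP) sketch. The class name StrictMaxDemo, the helper strictMaxName, the tied sample data, and the /@name suffix added to the expression are all assumptions made for the demo, not part of the answer. It suggests that the expression finds the book when the highest id is unique, and matches nothing when the highest id appears more than once.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; all names here are illustrative only.
final class StrictMaxDemo {
    // Sample data from the question (every id is unique).
    static final String UNIQUE_XML =
        "<library>"
        + "<book id='2' name='Dragon Tatoo'/>"
        + "<book id='7' name=\"Ender's Game\"/>"
        + "<book id='3' name='Catch 22'/>"
        + "<book id='1' name='Lord of the rings'/>"
        + "</library>";

    // Made-up data in which the highest id appears twice.
    static final String TIED_XML =
        "<library>"
        + "<book id='7' name='First'/>"
        + "<book id='7' name='Second'/>"
        + "</library>";

    // The expression from this answer, with /@name appended so the
    // result can be read back as a plain string.
    static final String STRICT_MAX =
        "/library/book[not(@id <= preceding-sibling::book/@id)"
        + " and not(@id <= following-sibling::book/@id)]/@name";

    // Returns the name of the matching book, or "" when nothing matches.
    static String strictMaxName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath()
                .evaluate(STRICT_MAX, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot evaluate", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(strictMaxName(UNIQUE_XML));           // Ender's Game
        System.out.println(strictMaxName(TIED_XML).isEmpty());   // true
    }
}
```

The empty result on the tied document comes from the <= operator: each of the two books with id 7 is "less than or equal to" the other, so both are filtered out.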

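【Comments】:

This does work, but performance is not great (when the document contains thousands of ids).
+1 - I repeated the core of your answer, but I wanted to give more information in mine, including some of what came up in the comments.
This does not work if all elements have the same value.

【Answer 3】:

I found that answers like lwburk's or timbooo's work for attributes whose numbers have only one digit. However, when the attribute is a number with more than one digit, something odd seems to happen when the attribute values are compared. For example, try replacing the original XML data with the following: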

<?xml version="1.0" encoding="utf-8"?>
<library>
        <book id="250" name="Dragon Tatoo"/>
        <book id="700123" name="Ender's Game"/>
        <book id="305" name="Catch 22"/>
        <book id="1070" name="Lord of the rings"/>
</library>

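Running the suggested snippets will not work. I got a solution by applying the cast operator xs:int() to the id attribute (note that xs:int() is an XPath 2.0 constructor function, so this needs an XPath 2.0 processor), for example: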

/library/book[not(xs:int(@id) <= preceding-sibling::book/@id) and not(xs:int(@id) <= following-sibling::book/@id)]

This gives the correct answer!
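
For comparison, here is a hedged Java (JAXP) sketch of the same multi-digit document under an XPath 1.0 engine. In XPath 1.0 the relational operators convert their operands to numbers, so the uncast expression should already compare numerically; the string-like behavior described above is more likely what an XPath 2.0 processor does with untyped attribute values. The class name MultiDigitDemo, the helper eval, and the /@name suffix are assumptions for the demo. number() is the XPath 1.0 core function that plays the role of the xs:int() cast.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; all names here are illustrative only.
final class MultiDigitDemo {
    // The multi-digit sample data from this answer.
    static final String MULTI_DIGIT_XML =
        "<library>"
        + "<book id='250' name='Dragon Tatoo'/>"
        + "<book id='700123' name=\"Ender's Game\"/>"
        + "<book id='305' name='Catch 22'/>"
        + "<book id='1070' name='Lord of the rings'/>"
        + "</library>";

    // Uncast comparison, as in the earlier answers.
    static final String WITHOUT_CAST =
        "/library/book[not(@id <= preceding-sibling::book/@id)"
        + " and not(@id <= following-sibling::book/@id)]/@name";

    // Explicit numeric conversion using the XPath 1.0 number() function.
    static final String WITH_NUMBER =
        "/library/book[not(number(@id) <= preceding-sibling::book/@id)"
        + " and not(number(@id) <= following-sibling::book/@id)]/@name";

    // Evaluates expr against the XML text and returns the string result.
    static String eval(String expr, String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot evaluate", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval(WITHOUT_CAST, MULTI_DIGIT_XML)); // Ender's Game
        System.out.println(eval(WITH_NUMBER, MULTI_DIGIT_XML));  // Ender's Game
    }
}
```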

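【Comments】:

【Answer 4】:

If you are willing to use external tools (which depends on whether your implementation supports them), try the EXSLT Math function highest().

The fact that EXSLT implements this means that such a feature is of course not directly available in plain XPath. If you are not using transforms, or just want to stick to standards-compliant markup, the other posters' suggestions would be a better choice.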

【Comments】:

【Answer 5】:

Note: the following information assumes XPath 1.0.

The following expression returns the element(s) with the largest id value:

/*/book[not(@id < preceding-sibling::book/@id) and 
        not(@id < following-sibling::book/@id)]

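Note that this differs slightly from @timbooo's answer: when several elements share the same maximum value, this returns all of them (@timbooo's returns none). If you want only one element in that case, you need a tie-breaking strategy. To select the first such element in document order, use: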

/*/book[not(@id < preceding-sibling::book/@id) and 
        not(@id < following-sibling::book/@id)][1]

To select the last one, use:

/*/book[not(@id < preceding-sibling::book/@id) and 
        not(@id < following-sibling::book/@id)][last()]
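
The tie-breaking variants above can be tried out with a small Java (JAXP) sketch. The class name TieDemo, the helper eval, the made-up document with a duplicated maximum, and the count(...) and /@name wrappers are assumptions for the demo rather than part of the answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; all names here are illustrative only.
final class TieDemo {
    // Made-up data: the highest id (7) appears on the first and last book.
    static final String TIED_XML =
        "<library>"
        + "<book id='7' name='First'/>"
        + "<book id='3' name='Middle'/>"
        + "<book id='7' name='Last'/>"
        + "</library>";

    // The non-strict expression from this answer: ties are all kept.
    static final String ALL_MAX =
        "/*/book[not(@id < preceding-sibling::book/@id) and "
        + "not(@id < following-sibling::book/@id)]";

    // Evaluates expr against the XML text and returns the string result.
    static String eval(String expr, String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot evaluate", e);
        }
    }

    public static void main(String[] args) {
        // Number of elements sharing the maximum id.
        System.out.println(eval("count(" + ALL_MAX + ")", TIED_XML));    // 2
        // First and last of those elements in document order.
        System.out.println(eval(ALL_MAX + "[1]/@name", TIED_XML));       // First
        System.out.println(eval(ALL_MAX + "[last()]/@name", TIED_XML));  // Last
    }
}
```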

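This approach is very inefficient (O(n^2)) because it compares every element against every other potential maximum. It is therefore better to use the host programming language to select the maximum element: select all book elements first, then pick the maximum from that list. This is (most likely) a linear operation (O(n)), which is noticeably faster on very large documents. For example, in Java (JAXP), you might do this: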

import javax.xml.xpath.*;
import org.w3c.dom.*;

// Assumes doc is an already-parsed org.w3c.dom.Document.
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("/*/book", doc,
        XPathConstants.NODESET);
// Track the best node and its id so each id is parsed only once.
Node max = null;
int maxval = Integer.MIN_VALUE;
for (int i = 0; i < nodes.getLength(); i++) {
    Node cur = nodes.item(i);
    int curval = Integer.parseInt(cur.getAttributes()
            .getNamedItem("id").getNodeValue());
    // >= keeps the last element in document order when ids tie.
    if (max == null || curval >= maxval) {
        max = cur;
        maxval = curval;
    }
}
if (max != null)
    System.out.println(max.getAttributes().getNamedItem("name"));

Note that this is only a demonstration; be sure to include null checks where appropriate.

【Comments】:

【Answer 6】:

XPath 1.0

/library/book[not(@id < /library/book/@id)]

This style of query is more general and works even when the books are grouped, i.e.

<?xml version="1.0" encoding="utf-8"?>
<library>
    <genre id="1">
        <book id="2" name="Dragon Tatoo"/>
        <book id="7" name="Ender's Game"/>
    </genre>
    <genre id="2">
        <book id="3" name="Catch 22"/>
        <book id="1" name="Lord of the rings"/>
    </genre>
</library>

The same query still works (the path needs to be adjusted):

/library/genre/book[not(@id < /library/genre/book/@id)]

or even

//book[not(@id < //book/@id)]

To avoid performance problems, use XPath 2.0's max() instead.
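
To see why the document-wide comparison in this answer handles grouped books, here is a minimal Java (JAXP) sketch. The class name GroupedDemo, the helper eval, and the count(...) and /@name wrappers are assumptions for the demo. It contrasts the //book form from this answer with a sibling-based predicate, which only compares books inside the same genre element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; all names here are illustrative only.
final class GroupedDemo {
    // The grouped sample data from this answer.
    static final String GROUPED_XML =
        "<library>"
        + "<genre id='1'>"
        + "<book id='2' name='Dragon Tatoo'/>"
        + "<book id='7' name=\"Ender's Game\"/>"
        + "</genre>"
        + "<genre id='2'>"
        + "<book id='3' name='Catch 22'/>"
        + "<book id='1' name='Lord of the rings'/>"
        + "</genre>"
        + "</library>";

    // Document-wide comparison, as in this answer.
    static final String GLOBAL_MAX = "//book[not(@id < //book/@id)]";

    // Sibling-based comparison: each book only sees its own genre.
    static final String SIBLING_MAX =
        "//book[not(@id < preceding-sibling::book/@id) and "
        + "not(@id < following-sibling::book/@id)]";

    // Evaluates expr against the XML text and returns the string result.
    static String eval(String expr, String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot evaluate", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval(GLOBAL_MAX + "/@name", GROUPED_XML));         // Ender's Game
        System.out.println(eval("count(" + GLOBAL_MAX + ")", GROUPED_XML));   // 1
        // One "maximum" per genre: ids 7 and 3.
        System.out.println(eval("count(" + SIBLING_MAX + ")", GROUPED_XML));  // 2
    }
}
```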

【Comments】:

【Answer 7】:

This example can be used to find the maximum value.

using System;
using System.Xml;

XmlDocument doc = new XmlDocument();
doc.Load("../../Employees.xml");
// Attribute names are case-sensitive, so every step uses @Id.
XmlNode node = doc.SelectSingleNode("//Employees/Employee/@Id[not(. <= ../preceding-sibling::Employee/@Id) and not(. <= ../following-sibling::Employee/@Id)]");
int maxId = Convert.ToInt32(node.Value);

For other similar topics on XPath and LINQ, see http://rmanimaran.wordpress.com/2011/03/20/xml-find-max-and-min-value-in-a-attribute-using-xpath-and-linq/

【Comments】: