【Question title】: Java Saxon XPath example with String as output
【Posted】: 2020-01-12 02:30:26
【Question】:

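I am trying to write Java code that uses Saxon XPath. I have two problems:

  1. My Java is not very good.
  2. I am not sure what the best way is to convert a net.sf.saxon.om.NodeInfo to a String.

Can anyone help? I know http://www.saxonica.com/download/download_page.xml has some good sample code, but it is not enough for this.

I have seen a similar SO discussion, "XPath processor output as string". In my case, however, I want to use Saxon, which works with NodeInfo.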

<pre>
<!-- language: java --> 
public class helloSaxon {
    public static void main(String[] args) {
        String xml = "";
        String xPathStatement = "";
        String xPathResult = "";
        SaxonXPath xPathEvaluation = null;
        Boolean xPathResultMatch = false;
        
        xml="<root><a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a><b><a>#DDD#</a></b></root>";

        //I'm using the following XPath Tester for test scenarios
        //https://www.freeformatter.com/xpath-tester.html#ad-output
        // Test #1
        xPathStatement="/root/a";
        xPathEvaluation = new SaxonXPath(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #1 xPathResult - " + xPathResult);
            //xPathResult == "<a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a>";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #1 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == true;

        // Test #2
        xPathStatement="//a";
        xPathEvaluation.Reset(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #2 xPathResult - " + xPathResult);
            //xPathResult == "<a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a><a>#DDD#</a>";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #2 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == true;

        // Test #3
        xPathStatement="/root/a[1]/text()";
        xPathEvaluation.Reset(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #3 xPathResult - " + xPathResult);
            //xPathResult == "#BBB#";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #3 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == true;

        // Test #4
        xPathStatement="/a/root/a/text()";
        xPathEvaluation.Reset(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #4 xPathResult - " + xPathResult);
            //xPathResult == "";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #4 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == false;
            
        // Test #5
        xPathStatement="/root";
        xPathEvaluation.Reset(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #5 xPathResult - " + xPathResult);
            //xPathResult == "<root><a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a><b><a>#DDD#</a></b></root>";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #5 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == true;         
    }
    static class SaxonXPath{
        private String xml;
        private String xPathStatement;
        private String xPathResult;
        private Boolean xPathResultMatch;
        public SaxonXPath(String xml, String xPathStatement){
            this.Reset(xml, xPathStatement);
        }
        public void Reset(String xml, String xPathStatement){
            this.xml = xml;
            this.xPathStatement = xPathStatement;
            this.xPathResult = "";
            this.xPathResultMatch = null;
            this.Evaluate();
        }
        public void Evaluate(){
            try{
                System.setProperty("javax.xml.xpath.XPathFactory:" + NamespaceConstant.OBJECT_MODEL_SAXON, "net.sf.saxon.xpath.XPathFactoryImpl");
                XPathFactory xPathFactory = XPathFactory.newInstance(NamespaceConstant.OBJECT_MODEL_SAXON);
                XPath xPath = xPathFactory.newXPath();
                InputSource inputSource = new InputSource(new StringReader(this.xml));
                SAXSource saxSource = new SAXSource(inputSource);
                Configuration config = ((XPathFactoryImpl) xPathFactory).getConfiguration();
                DocumentInfo document = config.buildDocument(saxSource);      
                XPathExpression xPathExpression = xPath.compile(this.xPathStatement);

                List matches = (List) xPathExpression.evaluate(document, XPathConstants.NODESET);
                if (matches != null && matches.size()>0) {
                    this.xPathResultMatch = true;   
                    for (Iterator iter = matches.iterator(); iter.hasNext();) {
                        NodeInfo node = (NodeInfo) iter.next();
                        
                        //need to convert content of "node" to string
                        xPathResult += node.getStringValue();
                    }
                } else {
                    this.xPathResultMatch = false;
                }
            } catch(Exception e){
                e.printStackTrace();
            }           
        }
        public String getxPathResult(){
            return this.xPathResult;
        }
        public Boolean getxPathResultMatch(){
            return this.xPathResultMatch;
        }
    }
}
</pre>

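The inputs will be:

  1. The XML as a string
  2. The XPath expression as a string

The outputs will be:

  3. The XPath evaluation result as a string
  4. Whether the XPath matched anything, as a boolean

I have also added some test examples in the code comments so that you can better understand what I am trying to do.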

【Question comments】:

    Tags: java xml xpath saxon


    【Answer 1】:

    First, I would recommend using the s9api interface for this rather than the JAXP XPath interface. There are many reasons, in particular:

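    • The JAXP interface is geared very much to XPath 1.0; for example, it only recognizes the data types string, number, boolean, and node-set. XPath 2.0 has a much richer type system.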

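    • The JAXP interface has very little type safety; it makes extensive use of Object as a parameter and result type, and it makes no use of Java generics.

    • Any portability advantage of using a standard API is spurious, because (a) all known implementations other than Saxon support only XPath 1.0, and (b) the types of values that can be supplied to interfaces declared to accept Object vary from one product to another.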

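    Your code creates a new XPathFactory every time it evaluates an XPath expression. Creating an XPathFactory is a very expensive operation, because it involves searching the classpath and examining many different JAR files to see which one contains an appropriate XPath engine.

    In addition, your code builds the source document from scratch every time it evaluates an XPath expression. Again, this is very expensive.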
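
The two costs described above can be avoided by creating the factory once and parsing the document once, then evaluating as many expressions as needed against the same tree. The following is only a minimal sketch of that idea. It assumes the JDK's built-in JAXP XPath 1.0 engine and DOM rather than Saxon, and the class name `ReusableXPathEvaluator` and method `countMatches` are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: one factory lookup, one parse, many evaluations.
public class ReusableXPathEvaluator {
    // The factory lookup is the expensive step, so it happens only once.
    // Note: XPathFactory is not thread-safe; this sketch assumes single-threaded use.
    private static final XPathFactory FACTORY = XPathFactory.newInstance();

    private final XPath xpath = FACTORY.newXPath();
    private final Document document;

    public ReusableXPathEvaluator(String xml) {
        try {
            // Parse the source document once and keep the tree.
            document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Evaluates an expression against the already-built document
    // and returns the number of selected nodes.
    public int countMatches(String expression) {
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }
}
```

When every call really does receive different XML, only the parse has to be repeated; the factory (and, for repeated expressions, a compiled XPathExpression) can still be kept.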

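    Having said all that, returning strings and booleans with JAXP is not very difficult. You just change the argument XPathConstants.NODESET, which indicates the expected result type, to XPathConstants.STRING or XPathConstants.BOOLEAN, and the evaluate() call will return a string or a boolean in place of a list of nodes. But if you want to return a date or a duration, you are stuck, because JAXP does not support that.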
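
As a rough illustration of switching the result type, the sketch below evaluates the same kind of expressions with XPathConstants.STRING and XPathConstants.BOOLEAN. It assumes the JDK's default JAXP XPath implementation, not Saxon; the class `JaxpResultTypes` and its two methods are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper showing the STRING and BOOLEAN return types of JAXP.
public class JaxpResultTypes {
    // Created once and reused (single-threaded use assumed).
    private static final XPathFactory FACTORY = XPathFactory.newInstance();

    // Returns the XPath string value of the result
    // (for a node-set, the string value of its first node, or "" if empty).
    public static String asString(String xml, String expression) {
        try {
            return (String) FACTORY.newXPath().evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or XML", e);
        }
    }

    // Returns the XPath boolean value of the result
    // (for a node-set, true when at least one node was selected).
    public static boolean asBoolean(String xml, String expression) {
        try {
            return (Boolean) FACTORY.newXPath().evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or XML", e);
        }
    }
}
```

Note that STRING yields the string value of a node, i.e. its concatenated text content without any markup, so it is not a serialization of the element.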

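    【Comments】:

    • Thank you @Michael.Kay for the excellent comments! :) The problem with XPathConstants.STRING is that for the expression "/root" it returns "#BBB##CCC##DDD#", whereas I want the element markup included as well (as in the expected result noted for Test #5 in the question).
    • As I understand it, I have to use NodeInfo in order to get the tag name, namespaces, node content, and so on, but the problem with NodeInfo seems to be that there is no out-of-the-box way to convert it to a string...
    • If you want a serialized representation of a node as the result of the XPath, you can either return the node and serialize it in the calling application, or call the serialize(node) function within the XPath expression itself.
    • Thanks @michael-kay! Unfortunately I cannot return the node to the calling application; I have to do the serialization inside the function. I tested with "xPathResult += net.sf.saxon.query.QueryResult.serialize(node);" and it returned the string I was after. Could you comment on "net.sf.saxon.query.QueryResult.serialize(NodeInfo)"? Is it an expensive operation? Or did you have something else in mind with "the serialize(node) function within the XPath expression itself"?
    • Serializing in the calling application with QueryResult.serialize() and serializing within the XPath expression with fn:serialize() will probably cost exactly the same. Yes, serializing a large document is an expensive operation and is best avoided if you can. But sometimes it is of course necessary, for example if you are saving the XML to filestore.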
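
The "return the node, then serialize it in the calling code" approach discussed above can be sketched without Saxon as well. The example below is only an analogue of QueryResult.serialize(): it assumes the JDK's DOM, default XPath engine, and identity Transformer, and the names `NodeSerializer` and `selectAndSerialize` are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: select nodes with XPath, then serialize each one to markup.
public class NodeSerializer {
    public static String selectAndSerialize(String xml, String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);

            // An identity transformer copies each node to the output as XML text.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            for (int i = 0; i < nodes.getLength(); i++) {
                serializer.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
            }
            return out.toString(); // empty string when nothing matched
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate or serialize", e);
        }
    }
}
```

The serializer may normalize details such as attribute quoting, so the output is equivalent markup rather than a byte-for-byte copy of the input. As with the Saxon calls, the cost grows with the size of what is serialized.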
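    【Answer 2】:

    I just wanted to add the edited code based on @MichaelKay's input. Although it is an expensive operation, I still call buildDocumentTree on every call, because I will have different XML each time. I hope others will find it useful too, or will offer good comments on how to improve performance :)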

    import java.io.StringReader;
    import java.util.Iterator;
    import java.util.List;
    import javax.xml.transform.sax.SAXSource;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathFactory;
    import javax.xml.xpath.XPathFactoryConfigurationException;
    import net.sf.saxon.Configuration;
    import net.sf.saxon.lib.NamespaceConstant;
    import net.sf.saxon.om.NodeInfo;
    import net.sf.saxon.om.TreeInfo;
    import net.sf.saxon.xpath.XPathFactoryImpl;
    import org.xml.sax.InputSource;
    
    public class helloSaxon {
    
        public static void main(String[] args) {
    
            String xml = "";
            String xPathStatement = "";
            String xPathResult = "";
            SaxonXPath xPathEvaluation = null;
            Boolean xPathResultMatch = false;
    
            xml="<root version = '1.0' encoding = 'UTF-8' xmlns:bar='http://www.smth.org/'><bar:a>#BBB#</bar:a><a>#CCC#</a><b><a>#DDD#</a></b></root>";
    
            //I'm using the following XPath Tester for test scenarios
            //https://www.freeformatter.com/xpath-tester.html#ad-output
            // Test #1
            xPathStatement="/root/a";
    
            xPathEvaluation = new SaxonXPath(xml, xPathStatement);
    
            xPathResult = xPathEvaluation.getxPathResult();
                System.out.println("Test #1 xPathResult - " + xPathResult);
                //xPathResult == "<a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a>";
            xPathResultMatch = xPathEvaluation.getxPathResultMatch();
                System.out.println("Test #1 xPathResultMatch - " + xPathResultMatch);
                //xPathResultMatch == true;
    
            // Test #2
            xPathStatement="//a";
            xPathEvaluation.Reset(xml, xPathStatement);
            xPathResult = xPathEvaluation.getxPathResult();
                System.out.println("Test #2 xPathResult - " + xPathResult);
                //xPathResult == "<a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a><a>#DDD#</a>";
            xPathResultMatch = xPathEvaluation.getxPathResultMatch();
                System.out.println("Test #2 xPathResultMatch - " + xPathResultMatch);
                //xPathResultMatch == true;
    
            // Test #3
            xPathStatement="/root/a[1]/text()";
            xPathEvaluation.Reset(xml, xPathStatement);
            xPathResult = xPathEvaluation.getxPathResult();
                System.out.println("Test #3 xPathResult - " + xPathResult);
                //xPathResult == "#BBB#";
            xPathResultMatch = xPathEvaluation.getxPathResultMatch();
                System.out.println("Test #3 xPathResultMatch - " + xPathResultMatch);
                //xPathResultMatch == true;
    
            // Test #4
            xPathStatement="/a/root/a/text()";
            xPathEvaluation.Reset(xml, xPathStatement);
            xPathResult = xPathEvaluation.getxPathResult();
                System.out.println("Test #4 xPathResult - " + xPathResult);
                //xPathResult == "";
            xPathResultMatch = xPathEvaluation.getxPathResultMatch();
                System.out.println("Test #4 xPathResultMatch - " + xPathResultMatch);
                //xPathResultMatch == false;
    
            // Test #5
            xPathStatement="/root";
            xPathEvaluation.Reset(xml, xPathStatement);
            xPathResult = xPathEvaluation.getxPathResult();
                System.out.println("Test #5 xPathResult - " + xPathResult);
                //xPathResult == "<root><a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a><b><a>#DDD#</a></b></root>";
            xPathResultMatch = xPathEvaluation.getxPathResultMatch();
                System.out.println("Test #5 xPathResultMatch - " + xPathResultMatch);
                //xPathResultMatch == true;         
        }
        static class SaxonXPath{
            private String xml;
            private String xPathStatement;
            private String xPathResult;
            private Boolean xPathResultMatch;
            private XPathFactory xPathFactory;
            private XPath xPath;
            public SaxonXPath(String xml, String xPathStatement){
                System.setProperty("javax.xml.xpath.XPathFactory:" + NamespaceConstant.OBJECT_MODEL_SAXON, "net.sf.saxon.xpath.XPathFactoryImpl");
                try {
                    this.xPathFactory = XPathFactory.newInstance(NamespaceConstant.OBJECT_MODEL_SAXON);
                } catch (XPathFactoryConfigurationException e) {
                    e.printStackTrace();
                }
                this.xPath = this.xPathFactory.newXPath();
                this.Reset(xml, xPathStatement);
            }
            public void Reset(String xml, String xPathStatement){
                this.xml = xml;
                this.xPathStatement = xPathStatement;
                this.xPathResult = "";
                this.xPathResultMatch = null;
                try{                
                    InputSource inputSource = new InputSource(new StringReader(this.xml));
                    SAXSource saxSource = new SAXSource(inputSource);
                    Configuration config = ((XPathFactoryImpl) this.xPathFactory).getConfiguration();
                    TreeInfo document = config.buildDocumentTree(saxSource);
                    XPathExpression xPathExpression = this.xPath.compile(this.xPathStatement);
                    List<NodeInfo> matches = (List<NodeInfo>) xPathExpression.evaluate(document, XPathConstants.NODESET);
                    if (matches != null && matches.size()>0) {
                        this.xPathResultMatch = true;   
                        for (Iterator<NodeInfo> iter = matches.iterator(); iter.hasNext();) {
                            NodeInfo node = (NodeInfo) iter.next();
    
                            xPathResult += net.sf.saxon.query.QueryResult.serialize(node);
                        }
                    } else {
                        this.xPathResultMatch = false;
                    }
                } catch(Exception e){
                    e.printStackTrace();
                }           
            }
            public String getxPathResult(){
                return this.xPathResult;
            }
            public Boolean getxPathResultMatch(){
                return this.xPathResultMatch;
            }
        }
    }
    

    【Comments】:
