Is there any way other than using XPath for this?

【Question title】: Is there any way other than using XPath for this?
【Posted】: 2015-12-08 13:24:26
【Question】:

Hi everyone, I am writing this program:

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DOMbooks {
   public static void main(String[] args) throws Exception {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      DocumentBuilder docBuilder = factory.newDocumentBuilder();
      File file = new File("books-fixed.xml");
      Document doc = docBuilder.parse(file);
      NodeList list = doc.getElementsByTagName("*");
      int bookCounter = 0; // incremented before printing, so the first book is "BOOK 1"
      for (int i = 0; i < list.getLength(); i++) {
         Element element = (Element)list.item(i);
         String nodeName = element.getNodeName();
         if (nodeName.equals("book")) {
            bookCounter++;
            System.out.println("BOOK " + bookCounter);
            String isbn = element.getAttribute("sequence");
            System.out.println("\tsequence:\t" + isbn);
         } 
         else if (nodeName.equals("auther")) { // the sample XML spells this element "auther"
            System.out.println("\tAuthor:\t" + element.getChildNodes().item(0).getNodeValue());
         }
         else if (nodeName.equals("title")) {
            System.out.println("\tTitle:\t" + element.getChildNodes().item(0).getNodeValue());
         } 
         else if (nodeName.equals("publishYear")) {
            System.out.println("\tpublishYear:\t" + element.getChildNodes().item(0).getNodeValue());
         } 
         else if (nodeName.equals("genre")) {
            System.out.println("\tgenre:\t" + element.getChildNodes().item(0).getNodeValue());
         } 
      }
   }
}

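I want to print all the data about the "Science Fiction" books. I know I should use XPath, but I got stuck with too many errors... any suggestions? Suppose I have this table and I only want to select the science fiction books, with all of their information: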

<book sequence="5">
  <title>Aftershock</title>
  <auther>Robert B. Reich</auther>
  <publishYear>2010</publishYear>
  <genre>Economics</genre>
</book>
<book sequence="6">
  <title>The Time Machine</title>
  <auther>H.G. Wells</auther>
  <publishYear>1895</publishYear>
  <genre>Science Fiction</genre>
</book>

Given this table, I only want to print the science fiction books, with all of their information...

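【Comments】:

Why did you get stuck with too many errors in XPath? It has been the de facto tool for querying XML for more than 15 years and has been very stable. Which processor are you getting errors with (assuming you mean bugs in the processor)?
Yes, I deleted the part where I wrote the XPath block; honestly, it was garbage. I imported many unnecessary packages and wrote many unnecessary lines of code. I'm completely new to this. I've been trying for 4 hours already, but nothing seems to work.
Let's take a step back. Why don't you show us a (small but relevant) part of the input XML, and the same for the expected output XML? Maybe XPath isn't the right fit and you need XSLT. Many people use Java with XML technologies without problems, but using a harder technique, like the one you are using now, makes things worse... (imo)
I've edited the question, you can check it now...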

Tags: java xml dom xpath

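【Answer 1】:

I want to print all the data about the "Science Fiction" books. I know I should use XPath, but I got stuck,

I assume you mean that you want all books where genre == "Science Fiction", right? In that case, XPath really is much simpler than anything you have tried in Java (you don't show the root node, so I'll start with '//', which selects at any depth):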

//book[genre = 'Science Fiction']
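To show how that expression can be evaluated from Java, here is a minimal sketch using only the JDK's built-in `javax.xml.xpath` API. The class name `SciFiBooks`, the helper `describeSciFiBooks` and the `<books>` root element are assumptions made for illustration (the question does not show the root); the two `<book>` entries are taken from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SciFiBooks {
    // Sample data: the <books> root is an assumption, the entries come from the question.
    static final String SAMPLE_XML =
        "<books>"
        + "<book sequence=\"5\"><title>Aftershock</title><auther>Robert B. Reich</auther>"
        + "<publishYear>2010</publishYear><genre>Economics</genre></book>"
        + "<book sequence=\"6\"><title>The Time Machine</title><auther>H.G. Wells</auther>"
        + "<publishYear>1895</publishYear><genre>Science Fiction</genre></book>"
        + "</books>";

    // Returns a "BOOK n" line per matching book, followed by one
    // "name: value" line for its sequence attribute and each child element.
    static List<String> describeSciFiBooks(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList books = (NodeList) xpath.evaluate(
            "//book[genre = 'Science Fiction']", doc, XPathConstants.NODESET);

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            lines.add("BOOK " + (i + 1));
            lines.add("sequence: " + book.getAttribute("sequence"));
            // Walk the children, skipping whitespace text nodes.
            for (Node child = book.getFirstChild(); child != null; child = child.getNextSibling()) {
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    lines.add(child.getNodeName() + ": " + child.getTextContent());
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        for (String line : describeSciFiBooks(SAMPLE_XML)) {
            System.out.println(line);
        }
    }
}
```

To run it against your file instead, pass a `File` to `DocumentBuilder.parse` as in your original program. Note that the comparison is an exact string match, so a genre written with surrounding whitespace would need `normalize-space(genre)` in the predicate.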

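An XSLT approach to simplify things

Now, looking at your code again, it seems that you want to print every element, including the element's name. That is simpler in XSLT: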

<!-- every XSLT 1.0 must start like this -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <!-- you want text -->
    <xsl:output method="text" />

    <!-- match any science fiction book (your primary goal) -->
    <xsl:template match="book[genre = 'Science Fiction']">

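        <xsl:text>BOOK </xsl:text>
        <!-- number the book among the science fiction books only;
             position() would also count the other sibling nodes -->
        <xsl:number count="book[genre = 'Science Fiction']" />

        <!-- send the children and attribute to be processed by templates -->
        <xsl:apply-templates select="@sequence | *" />

        <!-- end the last line so the next book starts on a new one -->
        <xsl:text>&#xA;</xsl:text>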
    </xsl:template>

    <!-- "catch" any elements or attributes under <book> -->
    <xsl:template match="book/* | book/@*">

        <!-- a newline and a tab per line-->
        <xsl:text>&#xA;&#9;</xsl:text>

        <!-- and the name of the element or attribute -->
        <xsl:value-of select="local-name()" />

        <!-- another tab, plus contents of the element or attribute -->
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="." />
    </xsl:template>

    <!-- make sure that other values are ignored, but process children -->
    <xsl:template match="node()">
        <xsl:apply-templates />
    </xsl:template>

</xsl:stylesheet>

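You can use this code; it is shorter than your original (if you ignore comments and whitespace) and (arguably, once you get the hang of it) more readable. To use it:

Store it as books.xsl

Then simply use this (copied and adapted from here):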

import javax.xml.transform.*;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;

public class TestMain {
    public static void main(String[] args) throws IOException, URISyntaxException, TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        Source xslt = new StreamSource(new File("books.xsl"));
        Transformer transformer = factory.newTransformer(xslt);

        Source text = new StreamSource(new File("books-fixed.xml"));
        transformer.transform(text, new StreamResult(new File("output.txt")));
    }
}
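If you want to try a stylesheet quickly without creating files, the same JDK API also accepts readers and writers. The following is only a sketch: the class name `InMemoryTransform`, the helper `transform`, the `<books>` root element and the cut-down stylesheet (which prints just the title of each science fiction book) are all assumptions for illustration, not part of the stylesheet above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class InMemoryTransform {
    // Hypothetical, cut-down stylesheet: one title per science fiction book.
    static final String SAMPLE_XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"book[genre = 'Science Fiction']\">"
        + "<xsl:value-of select=\"title\"/><xsl:text>&#xA;</xsl:text>"
        + "</xsl:template>"
        // Suppress all other text so only the matched titles are printed.
        + "<xsl:template match=\"text()\"/>"
        + "</xsl:stylesheet>";

    // Sample data: the <books> root is an assumption, the entries come from the question.
    static final String SAMPLE_XML =
        "<books>"
        + "<book sequence=\"5\"><title>Aftershock</title><genre>Economics</genre></book>"
        + "<book sequence=\"6\"><title>The Time Machine</title><genre>Science Fiction</genre></book>"
        + "</books>";

    // Applies an XSLT stylesheet to an XML document, both given as strings,
    // and returns the transformation result as a string.
    static String transform(String xsl, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(transform(SAMPLE_XSL, SAMPLE_XML));
    }
}
```

Swapping `SAMPLE_XSL` for the contents of books.xsl should give the same text that the file-based version writes to output.txt.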

XPath 2.0

If you can use Saxon from Java, the code above becomes an XPath 2.0 one-liner, and you don't even need XSLT:

for $book in //book[genre = 'Science Fiction']
return (
    'BOOK', 
    count(//book[genre = 'Science Fiction'][. << $book]) + 1,
    for $tag in $book/(@sequence | *)
    return ($tag/local-name(), ':', string($tag))
)

【Comments】: