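How to read an XML file by its element ids? Answers

【Question title】: How to read an XML file by its element ids?
【Posted】: 2014-09-14 01:17:17
【Question】:

This is my current XML file. It gives me dialogue for different characters, or at least it should. I want to be able to specify an entity id and an option/quest id and get the matching output. How can I do that? Any help is much appreciated.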

<?xml version="1.0"?>
<dialoge>
<entity id="1"> <!-- questgiver -->
    <quest id="1">
        <option id="1">
            <precondition>player has not started quest</precondition>
            <output>hello there, can you kill 2 enemies for me?</output>
        </option>
        <option id="2">
            <precondition>player has completed quest and player has not...</precondition>
            <output>thankyou, have a sword for your troubles.</output>
        </option>
        <option id="3">
            <precondition>player has not finished quest</precondition>
            <output>you haven't finished yet.</output>
        </option>
        <option id="4">
            <outpur>thank you.</outpur>
        </option>
    </quest>
</entity>
<entity id="2"> <!-- villager -->
    <option id="1">
        <precondition>village is being destroyed</precondition>
        <output>our village is being destroyed, please help us!</output>
    </option>
    <option id="2">
        <precondition>village has been saved or destroyed</precondition>
        <output>we will never forget this.</output>
    </option>
    <option id="3">
        <output>hello.</output>
    </option>
</entity>
</dialoge>

This is what I have so far, but it doesn't work. I know this may be a silly question, but I couldn't find an answer anywhere on the web. Thanks.

public static void read() {
    try {
        File file = new File("text.xml");
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(file);
        doc.getDocumentElement().normalize();

        System.out.println("root of xml file " + doc.getDocumentElement().getNodeName());
        NodeList nodes = doc.getElementsByTagName("entity");
        System.out.println("==========================");

        for(int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if(node.getNodeType() == Node.ELEMENT_NODE) {
                Element element = (Element) node;
                        if(element.getElementsByTagName("entity").item(0).getTextContent().equals("output")) {

                }
                System.out.println("" + getValue("output", element));
            }
        }
    }catch(Exception e) {
        e.printStackTrace();
    }
}

private static String getValue(String tag, Element element) {
    NodeList nodes = element.getElementsByTagName(tag).item(0).getChildNodes();
    Node node = (Node) nodes.item(0);
    return node.getNodeValue();
}

【Comments】:

Tags: java xml xml-parsing

【Solution 1】:

The simplest approach is probably to use XPath...

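import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

try {
    // Parse the whole file into a DOM tree
    File file = new File("text.xml");
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(file);
    doc.getDocumentElement().normalize();

    // Select every element whose id attribute equals 1
    XPath xpath = XPathFactory.newInstance().newXPath();
    XPathExpression xExpress = xpath.compile("//*[@id='1']");
    NodeList nl = (NodeList) xExpress.evaluate(doc, XPathConstants.NODESET);
    System.out.println("Found " + nl.getLength() + " matches");
} catch (Exception e) {
    e.printStackTrace();
}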

The XPath query //*[@id='1'] finds every element in the document that has an id attribute with the value 1.

See the W3C XPath Tutorial and How XPath works for more details on XPath.
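To make the effect of that query concrete, here is a minimal sketch that runs an XPath expression and lists the names of the matched elements. The class name `XPathIdDemo`, the helper `matchingElementNames`, and the trimmed-down `SAMPLE` document are assumptions made for illustration; the answer itself only counts the matches in text.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathIdDemo {

    // Hypothetical, shortened stand-in for the question's text.xml.
    static final String SAMPLE =
          "<dialoge>"
        + "<entity id=\"1\"><quest id=\"1\">"
        + "<option id=\"1\"><output>hello there</output></option>"
        + "<option id=\"2\"><output>thank you</output></option>"
        + "</quest></entity>"
        + "<entity id=\"2\">"
        + "<option id=\"1\"><output>please help us!</output></option>"
        + "</entity>"
        + "</dialoge>";

    // Evaluates the expression against the XML text and returns the
    // element name of every node in the resulting node set.
    static List<String> matchingElementNames(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList matches = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < matches.getLength(); i++) {
            names.add(matches.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Matches any element with id="1": entities, quests and options alike.
        System.out.println(matchingElementNames(SAMPLE, "//*[@id='1']"));
        // Restricting the element name narrows the match to entities only.
        System.out.println(matchingElementNames(SAMPLE, "//entity[@id='1']"));
    }
}
```

Because `//*` matches on the attribute alone, the first query also returns quest and option elements, which is why a more specific path is usually needed for a lookup by entity.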

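【Comments】:

What if I only want to find entity tags? :)
//entity[@id='1'], or //entity if you want all entities regardless of the id value.
My XML file has two different entity elements with ids, and they both have child elements that also have ids. How should I find them? Would I first do entity[@id='1'] and then option[@id='1']? Thanks, this seems like a great solution.
That depends: are you only interested in the child elements, or do you need the parent as well? You can use //entity[@id='1']/quest/option[@id='1'], which returns all option elements with an id of 1 that are children of entity/quest nodes where the entity has an id of 1. You could also run two separate queries: first get the entity node, then use that entity node as the anchor (instead of doc) and search for the option nodes. Think of XPath as a query language for XML; it's very powerful.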
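Both approaches from the last comment can be sketched as follows. The class `DialogueLookup`, its method names, and the shortened `SAMPLE` document are assumptions for illustration. The sketch also uses `//option` (any descendant) rather than `/quest/option`, because in the question's file the options of entity 2 sit directly under the entity; that substitution is a choice made here, not part of the answer.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DialogueLookup {

    // Hypothetical, shortened stand-in for the question's text.xml.
    static final String SAMPLE =
          "<dialoge>"
        + "<entity id=\"1\"><quest id=\"1\">"
        + "<option id=\"1\"><output>hello there</output></option>"
        + "<option id=\"2\"><output>thank you</output></option>"
        + "</quest></entity>"
        + "<entity id=\"2\">"
        + "<option id=\"1\"><output>please help us!</output></option>"
        + "</entity>"
        + "</dialoge>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Approach 1: a single path from the document root to the output text.
    // Returns an empty string when nothing matches.
    static String outputSingleQuery(Document doc, int entityId, int optionId) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "//entity[@id='" + entityId + "']//option[@id='" + optionId + "']/output";
        return xpath.evaluate(expr, doc);
    }

    // Approach 2: find the entity first, then search relative to it.
    // The leading "." anchors the second query at the entity node.
    static String outputTwoStep(Document doc, int entityId, int optionId) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node entity = (Node) xpath.evaluate(
                "//entity[@id='" + entityId + "']", doc, XPathConstants.NODE);
        if (entity == null) {
            return "";
        }
        return xpath.evaluate(".//option[@id='" + optionId + "']/output", entity);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(outputSingleQuery(doc, 1, 2));
        System.out.println(outputTwoStep(doc, 2, 1));
    }
}
```

The ids are taken as `int` so that the expression strings cannot be broken by quotes in the input. The two-step form is mainly useful when the entity node is needed for several follow-up queries.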

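【Solution 2】:

Generally speaking, DOM is easier to use, but it has the overhead of parsing the entire XML before you can start using it. A SAX parser, by contrast, fires events as it parses: when it encounters an opening tag (e.g. <something>), it triggers the startElement event (the actual name of the event may differ).

See the Java tutorial on Parsing an XML File Using SAX.

Here is some sample code: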

import java.io.File;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class GetElementAttributesInSAXXMLParsing extends DefaultHandler {

    public static void main(String[] args) throws Exception {
        DefaultHandler handler = new GetElementAttributesInSAXXMLParsing();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false);
        SAXParser parser = factory.newSAXParser();
        parser.parse(new File("text.xml"), handler);    
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {

        System.out.println("Tag Name:"+qName);
        // get the number of attributes in the list
        int length = attributes.getLength();

        // process each attribute
        for (int i = 0; i < length; i++) {

            // get qualified (prefixed) name by index
            String name = attributes.getQName(i);
            System.out.println("Attribute Name:" + name);

            // get attribute's value by index.
            String value = attributes.getValue(i);
            System.out.println("Attribute Value:" + value);
        }
    }
}

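【Comments】:

Thanks for posting this, it's very helpful, but I'm confused about startElement(). From what I've read it seems to get called automatically; how does that happen? Also, what do the parameters String uri, String localName, etc. refer to? Thanks for your help.
@user3053027 It's called the visitor pattern. The SAX parser automatically calls the handler's methods in response to changes as it reads the XML document. This approach is great if you only want to parse the document once, from start to finish. If you want to query the document separately, you need to use the DOM approach, which lets you jump around the document at will, move in any direction, and run repeated queries...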
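As a rough illustration of how those callbacks can answer the original question in a single pass, the sketch below remembers which entity and option the parser is currently inside and captures the text of the matching output element. The class `OutputFinder`, its fields, and the shortened `SAMPLE` document are assumptions for illustration. On the parameters: with a default `SAXParserFactory`, which is not namespace-aware, the tag name arrives in `qName`, while `uri` and `localName` are typically empty; they carry the namespace URI and the unprefixed name once `setNamespaceAware(true)` is called on the factory.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class OutputFinder extends DefaultHandler {

    // Hypothetical, shortened stand-in for the question's text.xml.
    static final String SAMPLE =
          "<dialoge>"
        + "<entity id=\"1\"><quest id=\"1\">"
        + "<option id=\"1\"><output>hello there</output></option>"
        + "<option id=\"2\"><output>thank you</output></option>"
        + "</quest></entity>"
        + "<entity id=\"2\">"
        + "<option id=\"1\"><output>please help us!</output></option>"
        + "</entity>"
        + "</dialoge>";

    private final String wantedEntity;
    private final String wantedOption;
    private String currentEntity;   // id of the entity being parsed, if any
    private String currentOption;   // id of the option being parsed, if any
    private boolean inOutput;       // true while inside the wanted <output>
    private final StringBuilder text = new StringBuilder();
    private String result;          // stays null when nothing matches

    OutputFinder(String wantedEntity, String wantedOption) {
        this.wantedEntity = wantedEntity;
        this.wantedOption = wantedOption;
    }

    // Called by the parser for every opening tag.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (qName.equals("entity")) {
            currentEntity = attributes.getValue("id");
        } else if (qName.equals("option")) {
            currentOption = attributes.getValue("id");
        } else if (qName.equals("output") && result == null
                && wantedEntity.equals(currentEntity)
                && wantedOption.equals(currentOption)) {
            inOutput = true;
            text.setLength(0);
        }
    }

    // Called for text content; may arrive in several chunks per element.
    @Override
    public void characters(char[] ch, int start, int length) {
        if (inOutput) {
            text.append(ch, start, length);
        }
    }

    // Called for every closing tag; clears the state set in startElement.
    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("output") && inOutput) {
            result = text.toString();
            inOutput = false;
        } else if (qName.equals("option")) {
            currentOption = null;
        } else if (qName.equals("entity")) {
            currentEntity = null;
        }
    }

    // Parses the XML text once and returns the matching output, or null.
    static String find(String xml, String entityId, String optionId) throws Exception {
        OutputFinder handler = new OutputFinder(entityId, optionId);
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(find(SAMPLE, "1", "2"));
    }
}
```

The handler has to keep its own state because SAX does not retain the document; each lookup means another pass over the input, which is the trade-off against DOM described in the comment above.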