【Question title】: How to get the error's line number while validating an XML file against an XML schema
【Posted】: 2011-05-19 21:31:01
【Question】:

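I am trying to validate an XML file against a W3C XML Schema.

The following code does the job and reports when an error occurs. But I cannot get the line number of the error: it always returns -1.

Is there an easy way to get the line number?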

import java.io.File;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.w3c.dom.Document;
import org.xml.sax.SAXParseException;

    public class XMLValidation {

        public static void main(String[] args) {

            try {
                DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
                Document document = parser.parse(new File("myxml.xml"));

                SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
                Source schemaFile = new StreamSource(new File("myschema.xsd"));

                Schema schema = factory.newSchema(schemaFile);

                Validator validator = schema.newValidator();

                validator.validate(new DOMSource(document));

            } catch (SAXParseException e) {
                System.out.println(e.getLineNumber());
                e.printStackTrace();

            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

【Comments】:

  • Crazy: the other posts are about C#, not Java.

Tags: java xml xsd sax


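【Solution 1】:

Assuming the end goal is to have a validated DOM instance, the other answers require reading the XML document twice: first to validate it, and then again to build the object tree. That is fine if the document is given as a file path, but if it is supplied as an input stream, which in principle can be read only once, some kind of workaround is needed.

A more efficient alternative is to use a validating parser that checks the XML document against the schema while the object tree is being built. The code below shows how to set up a schema-validating DOM parser: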

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class XML {
    public static Document load(String xml, String xsd) {
        // The default error handler just prints errors to the standard error output. In
        // order to make the parser interrupt its work once a validation error is found,
        // we need to use a custom handler that throws an exception in response to any
        // reported issues.
        ErrorHandler errorHandler = new ErrorHandler() {
            @Override
            public void error(SAXParseException exception) throws SAXException {
                throw exception;
            }

            @Override
            public void fatalError(SAXParseException exception) throws SAXException {
                throw exception;
            }

            @Override
            public void warning(SAXParseException exception) throws SAXException {
                throw exception;
            }
        };

        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new File(xsd));

            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
            builderFactory.setNamespaceAware(true);
            builderFactory.setSchema(schema);

            DocumentBuilder builder = builderFactory.newDocumentBuilder();
            builder.setErrorHandler(errorHandler);

            // try-with-resources closes the stream even when validation fails.
            try (InputStream input = new FileInputStream(xml)) {
                return builder.parse(input);
            }
        }
        catch (SAXParseException e) {
            int row = e.getLineNumber();
            int col = e.getColumnNumber();
            String message = e.getMessage();
            System.out.println("Validation error at line " + row + ", column " + col + ": \"" + message + '"');
        }
        catch (Exception e) {
            e.printStackTrace();
        }

        return null;
    }

    public static void main(String[] args) {
        String xml = args[0];
        String xsd = args[1];

        Document document = load(xml, xsd);
        boolean valid = (document != null);

        System.out.println("Document \"" + xml + "\" is " + (valid ? "" : "not ") + "valid against schema \"" + xsd + '"');
    }
}

【Comments】:

    【Solution 2】:

    Replace this line:

    validator.validate(new DOMSource(document));
    

    with

    validator.validate(new StreamSource(new File("myxml.xml")));
    

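    so that the validator reads the file itself. The SAXParseException will then contain the line and column numbers. A DOM tree built beforehand does not keep source positions, which is why validating a DOMSource reports -1.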
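
    As an illustration of the replacement above, here is a minimal, self-contained sketch that validates from a stream and reads the line number off the exception. The class name `StreamValidationSketch`, the helper `firstErrorLine`, and the inline schema and documents are assumptions made for this example and are not part of the original question. It relies on the default behaviour of `Validator` when no `ErrorHandler` is set, which is to throw on the first error.

```java
import java.io.StringReader;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.xml.sax.SAXParseException;

class StreamValidationSketch {

    // Hypothetical schema used only for this example: <item> holding one boolean <active>.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">\n"
        + "  <xs:element name=\"item\">\n"
        + "    <xs:complexType>\n"
        + "      <xs:sequence>\n"
        + "        <xs:element name=\"active\" type=\"xs:boolean\"/>\n"
        + "      </xs:sequence>\n"
        + "    </xs:complexType>\n"
        + "  </xs:element>\n"
        + "</xs:schema>\n";

    // Hypothetical helper: returns the line of the first validation error,
    // or 0 when the document is valid.
    static int firstErrorLine(String xsd, String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();
        try {
            // A stream-based source lets the validator track positions while it reads.
            validator.validate(new StreamSource(new StringReader(xml)));
            return 0;
        } catch (SAXParseException e) {
            return e.getLineNumber();
        }
    }

    public static void main(String[] args) throws Exception {
        // 'yes' is not a valid xs:boolean, so validation is expected to fail.
        String invalid = "<item>\n  <active>yes</active>\n</item>";
        System.out.println("First error at line " + firstErrorLine(XSD, invalid));
    }
}
```

    In the original program the same effect should be achievable by passing `new StreamSource(new File("myxml.xml"))`, exactly as this answer suggests; the strings here only keep the sketch runnable without external files.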

    【Comments】:

      【Solution 3】:

      I found this:

      http://www.herongyang.com/XML-Schema/Xerces2-XSD-Validation-with-XMLReader.html

      which appears to provide the following details (including the line number):

      Error:
         Public ID: null
         System ID: file:///D:/herong/dictionary_invalid_xsd.xml
         Line number: 7
         Column number: 22
         Message: cvc-datatype-valid.1.2.1: 'yes' is not a valid 'boolean' 
         value.
      

      using this code:

      /**
       * XMLReaderValidator.java
       * Copyright (c) 2002 by Dr. Herong Yang. All rights reserved.
       */
      import java.io.IOException;
      import org.xml.sax.XMLReader;
      import org.xml.sax.helpers.DefaultHandler;
      import org.xml.sax.helpers.XMLReaderFactory;
      import org.xml.sax.SAXException;
      import org.xml.sax.SAXParseException;
      class XMLReaderValidator {
         public static void main(String[] args) {
            String parserClass = "org.apache.xerces.parsers.SAXParser";
            String validationFeature 
               = "http://xml.org/sax/features/validation";
            String schemaFeature 
               = "http://apache.org/xml/features/validation/schema";
            try {
               String x = args[0];
               XMLReader r = XMLReaderFactory.createXMLReader(parserClass);
               r.setFeature(validationFeature,true);
               r.setFeature(schemaFeature,true);
               r.setErrorHandler(new MyErrorHandler());
               r.parse(x);
            } catch (SAXException e) {
               System.out.println(e.toString()); 
            } catch (IOException e) {
               System.out.println(e.toString()); 
            }
         }
         private static class MyErrorHandler extends DefaultHandler {
            public void warning(SAXParseException e) throws SAXException {
               System.out.println("Warning: "); 
               printInfo(e);
            }
            public void error(SAXParseException e) throws SAXException {
               System.out.println("Error: "); 
               printInfo(e);
            }
            public void fatalError(SAXParseException e) throws SAXException {
               System.out.println("Fatal error: "); 
               printInfo(e);
            }
            private void printInfo(SAXParseException e) {
               System.out.println("   Public ID: "+e.getPublicId());
               System.out.println("   System ID: "+e.getSystemId());
               System.out.println("   Line number: "+e.getLineNumber());
               System.out.println("   Column number: "+e.getColumnNumber());
               System.out.println("   Message: "+e.getMessage());
            }
         }
      }
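
      The handler above prints each problem and lets parsing continue, and with the Xerces schema feature the parser typically finds the XSD through a schema location hint inside the XML document. A comparable sketch using the standard `javax.xml.validation` API, where the schema is passed in explicitly, might look like the following. The class name `CollectErrorsSketch`, the helper `validate`, and the inline schema are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class CollectErrorsSketch {

    // Hypothetical schema: <item> holding any number of boolean <active> elements.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">\n"
        + "  <xs:element name=\"item\">\n"
        + "    <xs:complexType>\n"
        + "      <xs:sequence>\n"
        + "        <xs:element name=\"active\" type=\"xs:boolean\" maxOccurs=\"unbounded\"/>\n"
        + "      </xs:sequence>\n"
        + "    </xs:complexType>\n"
        + "  </xs:element>\n"
        + "</xs:schema>\n";

    // Hypothetical helper: returns one message per reported problem, with its position.
    static List<String> validate(String xsd, String xml) throws Exception {
        final List<String> problems = new ArrayList<>();
        Validator validator = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)))
                .newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { record("warning", e); }
            // Not rethrowing here lets validation continue to the next error.
            @Override public void error(SAXParseException e) { record("error", e); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException {
                record("fatal", e);
                throw e; // the document is not well-formed, so stop
            }
            private void record(String level, SAXParseException e) {
                problems.add(level + " at line " + e.getLineNumber()
                        + ", column " + e.getColumnNumber() + ": " + e.getMessage());
            }
        });
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // A fatal error was already recorded by the handler above.
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<item>\n  <active>yes</active>\n  <active>no</active>\n</item>";
        for (String problem : validate(XSD, xml)) {
            System.out.println(problem);
        }
    }
}
```

      Collecting messages instead of throwing on the first one is a design choice: it reports every invalid element in one pass, at the cost of having to check the list afterwards to decide whether the document is valid.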
      

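      【Comments】:

      • Thanks for the answer, but where should I put the XSD file?
      【Solution 4】:

      Try using the SAX Locator (http://download.oracle.com/javase/1.5.0/docs/api/org/xml/sax/Locator.html). Parsers are not required to supply one, but if they do, it should report line numbers.

      I think your code should include: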

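          // Field holding the locator supplied by the parser (stays null if the
          // parser does not provide one).
          private Locator locator;

          // Called by the XML parser before it starts reading the XML data;
          // save the reference so the current position can be queried later.
          public void setDocumentLocator(Locator locator) {
              this.locator = locator;
          }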
      

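      (see http://www.java-tips.org/java-se-tips/org.xml.sax/using-xml-locator-to-indicate-current-parser-pos.html)

      The parser will give you a locator, which you can then use to get the line number. It is probably worth printing or debugging at that point to check whether you actually received a valid locator.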
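
      To show how the saved locator might be used, here is a small sketch of a SAX handler that records the line on which each element's start tag ends. The names `LocatorSketch`, `LineTrackingHandler`, and `elementLines` are hypothetical and chosen for this example; the JAXP and SAX calls themselves are standard.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

class LocatorSketch {

    // Hypothetical handler that remembers the locator and notes a line per element.
    static class LineTrackingHandler extends DefaultHandler {
        private Locator locator;                  // null if the parser supplies none
        final List<String> seen = new ArrayList<>();

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            // Fall back to -1 when no locator was provided.
            int line = (locator != null) ? locator.getLineNumber() : -1;
            seen.add(qName + "@" + line);
        }
    }

    // Hypothetical helper: parses the XML text and returns "name@line" entries.
    static List<String> elementLines(String xml) throws Exception {
        LineTrackingHandler handler = new LineTrackingHandler();
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root>\n  <child/>\n</root>";
        for (String entry : elementLines(xml)) {
            System.out.println(entry);
        }
    }
}
```

      The null check mirrors the caveat in this answer: a parser is allowed to omit the locator, so the handler should not assume it is present.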

      【Comments】:
