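【Question Title】: How to parse an XML file with "Opening and ending tag mismatch" in Java
【Posted】: 2016-04-17 17:08:18
【Question Description】:

I have an XML file in which a Price tag is left unclosed. Is there a way to parse the file despite the error? How can I skip the product that contains the error and continue parsing?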

<Products>
      <Product Name="Gummi bears">
        <Price Currency="GBP">4.07</Price>
        <BestBefore Date="19-02-2014"/>
      </Product>
      <Product Name="Mounds">
        <Price Currency="AUD">5.64
        <BestBefore Date="08-04-2014"/>
      </Product>
      <Product Name="Vodka">
        <Price Currency="RUB">70</Price>
        <BestBefore Date="11-10-2014"/>
      </Product>
  </Products>

【Question Comments】:

Tags: java xml parsing tags


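【Solution 1】:

Here is the code. It implements the approach BrandonArp already mentioned.

One parser feature has to be enabled so that parsing continues past fatal errors: continue-after-fatal-error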

http://apache.org/xml/features/continue-after-fatal-error 
true:   Attempt to continue parsing after a fatal error.  
false:  Stops parse on first fatal error.  
default:    false  
XMLUni Predefined Constant:     fgXercesContinueAfterFatalError  
note:   The behavior of the parser when this feature is set to true is undetermined! Therefore use this feature with extreme caution because the parser may get stuck in an infinite loop or worse.  

More details can be found in the Xerces parser features documentation.

PriceReader class

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

public class PriceReader {

    public static void main(String[] args) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            XMLReader xmlReader = saxParser.getXMLReader();

            // Xerces-specific feature: keep parsing after a fatal error is reported
            try {
                xmlReader.setFeature(
                        "http://apache.org/xml/features/continue-after-fatal-error",
                        true);
            } catch (SAXException e) {
                System.out.println("error in setting up parser feature");
            }

            xmlReader.setContentHandler(new PriceHandler());
            xmlReader.setErrorHandler(new MyErrorHandler());
            xmlReader.parse("bin\\com\\test\\stack\\overflow\\sax\\prices.xml");
        } catch (Throwable e) {
            // Catch everything so a failed parse still ends with a message
            System.out.println("Error -- " + e.getMessage());
        }
    }
}

PriceHandler class

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class PriceHandler extends DefaultHandler {

    // Called for every opening tag; prints the Name attribute of each Product
    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes attributes)
            throws SAXException {

        if (qName.equalsIgnoreCase("Product")) {
            System.out.println("Product ::: " + attributes.getValue("Name"));
        }
    }
}

MyErrorHandler class

import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class MyErrorHandler implements ErrorHandler {

    private String getParseExceptionInfo(SAXParseException spe) {
        String systemId = spe.getSystemId();

        if (systemId == null) {
            systemId = "null";
        }

        String info = "URI=" + systemId + " Line=" 
            + spe.getLineNumber() + ": " + spe.getMessage();

        return info;
    }

    public void warning(SAXParseException spe) throws SAXException {
        System.out.println("Warning: " + getParseExceptionInfo(spe));
    }

    public void error(SAXParseException spe) throws SAXException {
        String message = "Error: " + getParseExceptionInfo(spe);
        System.out.println(message);
    }

    public void fatalError(SAXParseException spe) throws SAXException {
        String message = "Fatal Error: " + getParseExceptionInfo(spe);
        System.out.println(message);
    }
}

Output

Product ::: Gummi bears
Product ::: Mounds
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=9: The element type "Price" must be terminated by the matching end-tag "</Price>".
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=9: The end-tag for element type "Price" must end with a '>' delimiter.
Product ::: Vodka
Product ::: Rum
Product ::: Brezzer
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=21: The element type "Price" must be terminated by the matching end-tag "</Price>".
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=21: The end-tag for element type "Price" must end with a '>' delimiter.
Product ::: Water
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=26: The end-tag for element type "Product" must end with a '>' delimiter.
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=26: XML document structures must start and end within the same entity.
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=26: Premature end of file.
Error -- processing event: -1

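【Comments】:

  • It seems parsing stops after the error, and we never get the third product.
  • Let me try to reproduce it.
  • OK. I was able to reproduce and fix the problem. I will edit my answer.
【Solution 2】: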

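The general way to handle this kind of error is to use a streaming parser. For Java, SAX is the one that comes to mind.

When you create your handler, you can override/implement the error and fatalError methods. These let you continue parsing, but you still have to deal with the actual errors yourself.

Obviously, many different errors can occur in an XML document, and it only makes sense to handle some of them. Still, hopefully this gives you a starting point for working with the parser.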
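As a rough illustration of the handler idea above, here is a minimal sketch that parses the XML from the question out of a string and records fatal errors instead of aborting. The class name LenientSaxDemo, the CollectingHandler type, and the parseLeniently method are hypothetical names made up for this example. It also assumes a Xerces-based parser (such as the one bundled with the JDK) that recognizes the continue-after-fatal-error feature from Solution 1; on other parsers the feature is skipped and parsing will likely stop at the first fatal error.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; names are illustrative, not taken from the answers.
class LenientSaxDemo {

    /** Collects product names and fatal-error messages while parsing. */
    static class CollectingHandler extends DefaultHandler {
        final List<String> products = new ArrayList<>();
        final List<String> fatalErrors = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                String qName, Attributes attributes) {
            if ("Product".equals(qName)) {
                products.add(attributes.getValue("Name"));
            }
        }

        @Override
        public void fatalError(SAXParseException e) {
            // DefaultHandler rethrows the exception here; recording it
            // instead gives the parser a chance to carry on.
            fatalErrors.add("Line " + e.getLineNumber() + ": " + e.getMessage());
        }
    }

    static CollectingHandler parseLeniently(String xml) {
        CollectingHandler handler = new CollectingHandler();
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            try {
                // Assumption: Xerces-based parser. Behaviour after a fatal
                // error is documented as undetermined, so treat results with care.
                reader.setFeature(
                        "http://apache.org/xml/features/continue-after-fatal-error",
                        true);
            } catch (SAXException e) {
                // Feature not recognized or not supported: parse without it.
            }
            reader.setContentHandler(handler);
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Throwable t) {
            // Mirrors Solution 1, which catches Throwable around the parse.
            handler.fatalErrors.add("Parsing stopped: " + t.getMessage());
        }
        return handler;
    }

    public static void main(String[] args) {
        String xml =
                "<Products>\n"
              + "  <Product Name=\"Gummi bears\">\n"
              + "    <Price Currency=\"GBP\">4.07</Price>\n"
              + "  </Product>\n"
              + "  <Product Name=\"Mounds\">\n"
              + "    <Price Currency=\"AUD\">5.64\n"
              + "  </Product>\n"
              + "  <Product Name=\"Vodka\">\n"
              + "    <Price Currency=\"RUB\">70</Price>\n"
              + "  </Product>\n"
              + "</Products>";

        CollectingHandler result = parseLeniently(xml);
        System.out.println("Products seen: " + result.products);
        result.fatalErrors.forEach(System.out::println);
    }
}
```

Products that start before the first fatal error are always reported. What the parser delivers afterwards depends on the implementation, so the sketch only collects what it is given. Deciding whether a product is "broken" (for example, by checking which product was open when fatalError fired) is left to the caller.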

【Comments】:
