【Question title】: Problem With SAX Parsing in Java
【Posted】: 2011-09-23 17:48:58
【Question】:

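I asked a question earlier just to review some basics of SAX, and I learned a lot from the answers. Based on what I learned, I tried to write a Java program that iterates through a bunch of directories (a necessary part of the larger project I am working on), finds a file called "document.xml.rels" in each one, and uses SAX to identify the "Target" element, check whether it is an image file (its name contains "image"), link the target with the Id attribute, and print the pair with System.out.print. I am not getting any errors, either from the compiler or at runtime, so I am wondering whether I am not traversing the directory structure properly, or whether there is a problem with the conditionals in the SaxHandler class?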

Just a few notes...

I start from this directory:

C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items

I am trying to end up at this file:

C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items/ProposalOne/word/_rels/document.xml.rels

Here is my Java code:

import java.io.*;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class XMLParser
{
    static File directory = new File("C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items");
    static File files[] = directory.listFiles();

    public static void main(String[] args) throws IOException
    {
        //For each of the files in "/Extracted Items"...
        for(File f : files)
        {
            //...if it is a directory then...
            if(f.isDirectory())
            {
                //...create a new array populated with each of the files in the directory
                File directoryTwo = new File(f.getAbsolutePath());
                File filesTwo[] = directoryTwo.listFiles();

                //For each of the files in the new directory "/Proposal#"...
                for(File f2 : filesTwo)
                {
                    //...if it is a directory then...
                    if(f2.isDirectory())
                    {
                        //...create a new array populated with each of the files in the directory
                        File directoryThree = new File(f.getAbsolutePath());
                        File filesThree[] = directoryThree.listFiles();

                        //For each of the files in the new directory "/word"
                        for(File f3: filesThree)
                        {
                            //...if it is a directory then...
                            if(f3.isDirectory())
                            {
                                //...create a new array populated with each of the files in the directory
                                File directoryFour = new File(f.getAbsolutePath());
                                File filesFour[] = directoryFour.listFiles();

                                //For each of the files in the new directory "/_rels"
                                for(File f4: filesFour)
                                {
                                    if(f4.getName() == "document.xml.rels")
                                    {
                                        try 
                                        {
                                            // creates and returns new instance of SAX-implementation:
                                            SAXParserFactory factory = SAXParserFactory.newInstance();

                                            // create SAX-parser...
                                            SAXParser parser = factory.newSAXParser();

                                            // .. define our handler:
                                            SaxHandler handler = new SaxHandler();

                                            // and parse:
                                            parser.parse(f3.getAbsolutePath(), handler);    
                                        } 
                                        catch (Exception ex) 
                                        {
                                            ex.printStackTrace(System.out);
                                        }
                                    }
                                    else
                                    {
                                        break;
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }

     private static final class SaxHandler extends DefaultHandler 
     {
         // invoked when document-parsing is started:
         public void startDocument() throws SAXException 
         {
             System.out.println("Document processing started");
         }

         // notifies about finish of parsing:
         public void endDocument() throws SAXException 
         {
             System.out.println("Document processing finished");
         }

         // we enter to element 'qName':
         public void startElement(String uri, String localName, 
                 String qName, Attributes attrs) throws SAXException 
         {
             if(localName.equals("Relationship"))
             {
                 if(attrs.equals("Target"))
                 {
                     if(attrs.getValue("Target").contains("image"))
                     {
                         String id = attrs.getValue("Id");
                         String target = attrs.getValue("Target");
                         System.out.println("Id: " + id + "& Target: " + target);
                     }
                 }
             }  
             else 
             {
                 throw new IllegalArgumentException("Element '" + 
                         qName + "' is not allowed here");
             }
         }

         // we leave element 'qName' without any actions:
         public void endElement(String uri, String localName, String qName)
         throws SAXException 
         {
                // do nothing;
         }
     }
}

Here is the XML document I am working with:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
      <Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer1.xml" /> 
      <Relationship Id="rId13" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml" /> 
      <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml" /> 
      <Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header1.xml" /> 
      <Relationship Id="rId12" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml" /> 
      <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml" /> 
      <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Target="numbering.xml" /> 
      <Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/endnotes" Target="endnotes.xml" /> 
      <Relationship Id="rId11" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image3.png" /> 
      <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml" /> 
      <Relationship Id="rId10" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image2.jpeg" /> 
      <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml" /> 
      <Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.jpeg" /> 
</Relationships>

New Java code:

import java.io.*;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class XMLParser
{   
    public static void main(String[] args) throws IOException
    {
        traverse(new File("C:/Documents and Settings/rajeeva/workspace/Intern Project/Proposals/Converted Proposals/Extracted Items"));
    }

     private static final class SaxHandler extends DefaultHandler 
     {
         // invoked when document-parsing is started:
         public void startDocument() throws SAXException 
         {
             System.out.println("Document processing started");
         }

         // notifies about finish of parsing:
         public void endDocument() throws SAXException 
         {
             System.out.println("Document processing finished");
         }

         // we enter to element 'qName':
         public void startElement(String uri, String localName, 
                 String qName, Attributes attrs) throws SAXException 
         {
             if(localName.equals("Relationship"))
             {
                 if(attrs.equals("Target"))
                 {
                     if(attrs.getValue("Target").contains("image"))
                     {
                         String id = attrs.getValue("Id");
                         String target = attrs.getValue("Target");
                         System.out.println("Id: " + id + "& Target: " + target);
                     }
                 }
             }  
             else 
             {
                 throw new IllegalArgumentException("Element '" + 
                         qName + "' is not allowed here");
             }
         }

         // we leave element 'qName' without any actions:
         public void endElement(String uri, String localName, String qName)
         throws SAXException 
         {
                // do nothing;
         }
     }

     private static void traverse(File directory)
     {
        //Get all files in directory
        File[] files = directory.listFiles();
        for (File file : files)
        {
           if (file.isDirectory())
           {
              //It's a directory so (recursively) traverse it
              traverse(file);
           }
           else if (file.getName().equals("document.xml.rels"))
           {
               try 
                {
                    System.out.println("5");
                    // creates and returns new instance of SAX-implementation:
                    SAXParserFactory factory = SAXParserFactory.newInstance();

                    // create SAX-parser...
                    SAXParser parser = factory.newSAXParser();

                    // .. define our handler:
                    SaxHandler handler = new SaxHandler();

                    // and parse:
                    parser.parse(file.getAbsolutePath(), handler);    
                } 
                catch (Exception ex) 
                {
                    ex.printStackTrace(System.out);
                }
            }
         }
     }
}

New errors:

Document processing started
java.lang.IllegalArgumentException: Element 'Relationships' is not allowed here
    at XMLParser$SaxHandler.startElement(XMLParser.java:48)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at XMLParser.traverse(XMLParser.java:87)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.traverse(XMLParser.java:70)
    at XMLParser.main(XMLParser.java:13)
(The same output repeats three more times: a line "5", then "Document processing started", then the identical stack trace.)

Any ideas?

【Comments】:

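  • Do you see anything in the output indicating that your SaxHandler is (trying to) parse the file?
  • I am not seeing any output at all, lol. I tried littering the code with System.out.print statements, and I found that the code never enters the conditional if(f4.getName() == "document.xml.rels"). I feel like I am traversing the structure correctly (although there is probably a better recursive solution), so I just don't understand why the file is not being found?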
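
On the "better recursive solution" mentioned in the last comment: in the nested-loop version from the question, directoryThree and directoryFour are both built from f.getAbsolutePath() rather than from f2 and f3, and the else branch breaks out of the innermost loop at the first file with a different name, so the walk would likely still miss the file even with the comparison fixed. A minimal sketch of an alternative follows, assuming Java 8 or later (newer than this thread) so that Files.walk is available. The class name RelsFinder and the method findRels are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class RelsFinder {
    // Walks the whole tree under root and returns every regular file
    // named exactly "document.xml.rels", in sorted order.
    static List<Path> findRels(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().equals("document.xml.rels"))
                .sorted()
                .collect(Collectors.toList());
        } catch (IOException e) {
            // Rethrown unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical usage: pass the "Extracted Items" directory as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findRels(root)) {
            System.out.println(p);
        }
    }
}
```

Each returned path can then be handed to the parser as a File (SAXParser has a parse(File, DefaultHandler) overload), which avoids treating an absolute path containing spaces as a URI string.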

Tags: java xml sax xml-parsing directory


【Answer 1】:
       if(f4.getName() == "document.xml.rels")

compares object references with ==, not the contents of the two strings, so it should be

       if(f4.getName().equals("document.xml.rels"))
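
To make the difference concrete, here is a minimal sketch. The class name NameCheck and its two helper methods are made up for illustration. A String produced at run time, as the value returned by File.getName() typically is, is a different object from the literal even when the characters match, so == can be false where equals is true.

```java
import java.io.File;

class NameCheck {
    // Reference comparison: true only if both sides are the very same String object.
    static boolean sameReference(String name) {
        return name == "document.xml.rels";
    }

    // Content comparison: true whenever the characters match (and null-safe this way round).
    static boolean sameContents(String name) {
        return "document.xml.rels".equals(name);
    }

    public static void main(String[] args) {
        // getName() extracts the last path component at run time.
        String name = new File("word/_rels/document.xml.rels").getName();
        System.out.println("== gives " + sameReference(name));
        System.out.println("equals gives " + sameContents(name));
    }
}
```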

Edit: Re-reading your code, I found another problem.

             if(attrs.equals("Target"))

attrs is of type Attributes, so comparing it with a String this way will never be true.
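
A small sketch of the check that was probably intended: ask the attribute list for the value by name and test for null. AttributesImpl from org.xml.sax.helpers stands in here for the object the parser passes to startElement. The class name AttributeCheck and the helper imageTarget are made up for illustration.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

class AttributeCheck {
    // Returns the Target value if it is present and mentions "image", otherwise null.
    static String imageTarget(Attributes attrs) {
        String target = attrs.getValue("Target"); // lookup by qualified name; null if absent
        return (target != null && target.contains("image")) ? target : null;
    }

    public static void main(String[] args) {
        // Hand-built attribute list mirroring one Relationship element from the question.
        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "Id", "Id", "CDATA", "rId9");
        attrs.addAttribute("", "Target", "Target", "CDATA", "media/image1.jpeg");

        // An Attributes object is never equal to a String, whatever it contains.
        System.out.println("attrs.equals(\"Target\"): " + attrs.equals("Target"));
        System.out.println("image target: " + imageTarget(attrs));
    }
}
```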

【Comments】:

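  • Hahaha, oh how I "love" having to pay attention to detail :D. Thanks, that solved the traversal problem, and I was able to implement a better recursive solution for walking the directories... now I am running into a whole new set of problems with processing the xml file, lol. If anyone can help or wants to see the recursive solution, I edited the post above to reflect the changes.
  • @Joe: I edited my answer to include another problem I found.
  • A bit confused. Going by the xml file above, isn't "Target" a kind of attribute?
  • @Joe: You need to do something like String val = attrs.getValue("Target"); if (val != null) { if (val.contains("image")) { ...
  • So were you thinking of something like this: if(localName.equals("Relationship")) { String val = attrs.getValue("Target"); if(val != null) { if (val.contains("image")) { String id = attrs.getValue("Id"); System.out.println("Id: " + id + "& Target: " + val); } }}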
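
Pulling the comments above together, a self-contained sketch of such a handler might look like the following. It is one possible arrangement, not the asker's final code: the class name RelsImageLister and the method imageRelationships are made up, the XML is parsed from an in-memory string for the sake of the example, and namespace awareness is switched on explicitly (an assumption) so that localName is reliably populated. Elements other than Relationship are ignored rather than rejected.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class RelsImageLister {
    // Returns "Id -> Target" for every Relationship whose Target contains "image".
    static List<String> imageRelationships(String xml) {
        final List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (!"Relationship".equals(localName)) {
                    return; // skip the Relationships root and anything else
                }
                // Unprefixed attributes are in no namespace, hence the empty URI.
                String target = attrs.getValue("", "Target");
                if (target != null && target.contains("image")) {
                    found.add(attrs.getValue("", "Id") + " -> " + target);
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: needed for localName to be filled in
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse relationships", e);
        }
        return found;
    }

    public static void main(String[] args) {
        // Two entries copied from the document in the question.
        String xml =
            "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
            + "<Relationship Id=\"rId2\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles\" Target=\"styles.xml\"/>"
            + "<Relationship Id=\"rId9\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/image\" Target=\"media/image1.jpeg\"/>"
            + "</Relationships>";
        for (String line : imageRelationships(xml)) {
            System.out.println(line);
        }
    }
}
```

If the factory is left at its default (not namespace aware), comparing against qName and calling attrs.getValue("Target") is the usual alternative.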
【Answer 2】:

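The error message you showed us, "Element 'Relationships' is not allowed here", indicates that the document is not valid against its DTD. However, you have not shown us the DTD.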
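
One thing worth checking against that reading: the first frame of the posted stack trace is XMLParser$SaxHandler.startElement, and the message text matches the string built in the else branch of the asker's own handler. That suggests the exception is raised by the handler when it meets the Relationships root element, rather than by DTD validation. A minimal sketch that reproduces this behaviour follows. The class name HandlerExceptionDemo is made up, and the default (non namespace aware) factory is assumed, so the element name is read from qName.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class HandlerExceptionDemo {
    // Parses xml with a handler that, like the one in the question, rejects every
    // element other than Relationship, and reports where the exception came from.
    static String parseAndReport(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attrs) {
                    if (!qName.equals("Relationship")) {
                        throw new IllegalArgumentException("Element '" + qName + "' is not allowed here");
                    }
                }
            });
            return "no exception";
        } catch (IllegalArgumentException e) {
            // The top stack frame names the class that actually threw.
            StackTraceElement top = e.getStackTrace()[0];
            return e.getMessage() + " (thrown in " + top.getClassName() + ")";
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAndReport(
            "<Relationships><Relationship Id=\"rId1\" Target=\"styles.xml\"/></Relationships>"));
    }
}
```

Under that assumption, dropping the else branch (or simply returning for unexpected elements) would let parsing continue past the root element.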

【Comments】:
