【Question title】: Java: searching inside zips inside zips
【Posted】: 2011-02-22 08:14:05
【Question】:

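I have only recently started working with zip files in Java. So far everything has gone to plan, but I have hit one last hurdle: nested zips.

I am trying to search for files with a particular extension so that I can read them as text files. So far I can happily read files inside a zip file, but I know that some of the files sit in zips that are nested inside other zips.

Is there a way to create a ZipFile object from a ZipEntry, so that I can search it, without extracting the file?

There is no code sample, as this is not directly a code problem.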

K.巴拉德

【Comments】:

    Tags: java io zip nested inputstream


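    【Answer 1】:

    A ZipInputStream wraps a stream in a filter that decompresses the data. So when you find a ZipEntry that is itself a nested zip file, wrap the stream you are currently reading in a new ZipInputStream to interpret that entry's data. You can repeat this to whatever depth you need.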
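
    The wrapping described above can be sketched roughly as follows. This is only an illustration under a few assumptions: the class name NestedZipSearch and its methods are made up for this example, nested archives are recognised purely by a ".zip" suffix, the extension is passed in lower case, and matching files are decoded as UTF-8 text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library.
public class NestedZipSearch {

    /**
     * Returns the text of every entry whose name ends with the given
     * (lower-case) extension, keyed by its path. Nested zips are searched
     * in memory; nothing is written to disk.
     */
    public static Map<String, String> findText(InputStream in, String extension) throws IOException {
        Map<String, String> found = new LinkedHashMap<>();
        search(new ZipInputStream(in), "", extension, found);
        return found;
    }

    private static void search(ZipInputStream zis, String prefix, String extension,
                               Map<String, String> found) throws IOException {
        ZipEntry entry;
        while ((entry = zis.getNextEntry()) != null) {
            if (entry.isDirectory()) {
                continue; // directory entries carry no data
            }
            String path = prefix + entry.getName();
            String lower = entry.getName().toLowerCase();
            if (lower.endsWith(".zip")) {
                // zis is positioned on the nested zip's bytes, so wrap it again.
                // The inner stream is deliberately not closed: closing it would
                // also close the outer stream we are still iterating over.
                search(new ZipInputStream(zis), path + "!/", extension, found);
            } else if (lower.endsWith(extension)) {
                found.put(path, new String(readCurrentEntry(zis), StandardCharsets.UTF_8));
            }
        }
    }

    // Reads the remaining bytes of the current entry; read() returns -1 at its end.
    private static byte[] readCurrentEntry(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

    Calling NestedZipSearch.findText(new FileInputStream("outer.zip"), ".txt") would then return entries from every level, with "!/" marking each step into a nested archive (the separator is an arbitrary choice for this sketch).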

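    【Comments】:

      【Answer 2】:

      I wrote some code that reads files from zips as if they were virtual folders; you are welcome to take a look and see how it works. Strictly speaking, "without extracting" is not possible, but I assume you mean "without extracting to disk". This code does decompress the data, but it does so in memory rather than on disk.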

      import java.io.*;
      import java.util.Deque;
      import java.util.LinkedList;
      import java.util.logging.Level;
      import java.util.logging.Logger;
      import java.util.zip.ZipEntry;
      import java.util.zip.ZipException;
      import java.util.zip.ZipFile;
      import java.util.zip.ZipInputStream;
      
      /**
       * Allows read operations to happen transparently on a zip file, as if it were a
       * folder. Nested zips are also supported. All operations are read only.
       * Operations on a ZipReader with a path in an actual zip are expensive, so it's
       * good to keep in mind this when using the reader, you'll have to balance
       * between memory usage (caching) or CPU use (re-reading as needed).
       */
      public class ZipReader {
      
          /**
           * The top level zip file, which represents the actual file on the file system.
           */
          private final File topZip;
      
          /**
           * The chain of Files that this file represents.
           */
          private final Deque<File> chainedPath;
      
          /**
           * The actual file object.
           */
          private final File file;
      
          /**
           * Whether or not we have to dig down into the zip, or if
           * we can use trivial file operations.
           */
          private final boolean isZipped;
      
          /**
           * Creates a new ZipReader object, which can be used to read from a zip
           * file, as if the zip files were simple directories. All files are checked
           * to see if they are a zip.
           * 
           * <p>{@code new ZipReader(new File("path/to/container.zip/with/nested.zip/file.txt"));}</p>
           * 
           *
           * @param file The path to the internal file. This needn't exist, according
           * to File, as the zip file won't appear as a directory to other classes.
           * The constructor itself does not verify that the file exists; a missing
           * file surfaces as a FileNotFoundException from getInputStream().
           */
          public ZipReader(File file){
              chainedPath = new LinkedList<File>();
      
              this.file = file;
      
              //make sure file is absolute
              file = file.getAbsoluteFile();
      
              //We need to walk up the parents, putting those files onto the stack which are valid Zips
              File f = file;
              chainedPath.addFirst(f); //Gotta add the file itself to the path for everything to work
              File tempTopZip = null;
              while ((f = f.getParentFile()) != null) {
                  chainedPath.addFirst(f);
                  try {
                      //If this works, we'll know we have our top zip file. Everything else will have
                      //to be in memory, so we'll start with this if we have to dig deeper.
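                      //Only regular files can be zips; skipping directories and paths that
                      //don't exist on disk avoids logging a spurious error for each of them.
                      if (tempTopZip == null && f.isFile()) {
                          //Open and close immediately: we only care whether it parses as a zip
                          new ZipFile(f).close();
                          tempTopZip = f;
                      }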
                  } catch (ZipException ex) {
                      //This is fine, it's just not a zip file
                  } catch (IOException ex) {
                      Logger.getLogger(ZipReader.class.getName()).log(Level.SEVERE, null, ex);
                  }
              }
      
              //If it's not a zipped file, this will make operations easier to deal with,
              //so let's save that information
              isZipped = tempTopZip != null;
              if(isZipped){
                  topZip = tempTopZip;
              } else {
                  topZip = file;
              }
      
          }
      
          /**
           * Returns if this file exists or not. Note this is a non-trivial operation.
           * 
           * @return 
           */
          public boolean exists(){
              if(!topZip.exists()){
                  return false; //Don't bother trying
              }
              try{
                  getInputStream().close();
                  return true;
              } catch(IOException e){
                  return false;
              }
          }
      
          /**
           * Returns true if this file is read accessible. Note that if the file is a zip,
           * the permissions are checked on the topmost zip file.
           * @return 
           */
          public boolean canRead(){
              return topZip.canRead();
          }
      
          /**
           * Returns true if this file has write permissions. Note that if the file is nested
           * in a zip, then this will always return false
           * @return 
           */
          public boolean canWrite(){
              if(isZipped){
                  return false;
              } else {
                  return topZip.canWrite();
              }
          }
      
          /*
           * This function recurses down into a zip file, ultimately returning the InputStream for the file,
           * or throwing exceptions if it can't be found.
           */
          private InputStream getFile(Deque<File> fullChain, String zipName, final ZipInputStream zis) throws FileNotFoundException, IOException {
              ZipEntry entry;
              InputStream zipReader = new InputStream() {
      
                  @Override
                  public int read() throws IOException {
                      if (zis.available() > 0) {
                          return zis.read();
                      } else {
                          return -1;
                      }
                  }
      
                  @Override
                  public void close() throws IOException {
                      zis.close();
                  }
              };
              boolean isZip = false;
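              while ((entry = zis.getNextEntry()) != null) {
                  //This is at least a zip file
                  isZip = true;
                  if (entry.isDirectory()) {
                      //Explicit directory entries hold no data. Without this check an entry
                      //such as "with/" would match the chain and we'd try to read it as a zip.
                      continue;
                  }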
                  Deque<File> chain = new LinkedList<File>(fullChain);
                  File chainFile = null;
                  while ((chainFile = chain.pollFirst()) != null) {
                      if (chainFile.equals(new File(zipName + File.separator + entry.getName()))) {
                          //We found it. Now, chainFile is one that is in our tree
                          //We have to do some further analysis on it
                          break;
                      }
                  }
                  if (chainFile == null) {
                      //It's not in the chain at all, which means we don't care about it at all.
                      continue;
                  }
                  if (chain.isEmpty()) {
                      //It was the last file in the chain, so no point in looking at it at all.
                      //If it was a zip or not, it doesn't matter, because this is the file they
                      //specified, precisely. Read it out, and return it.
                      return zipReader;
                  }
      
                  //It's a single file, it's in the chain, and the chain isn't finished, so that
                  //must mean it's a container (or it's being used as one, anyways). Let's attempt to recurse.
      
                  ZipInputStream inner = new ZipInputStream(zipReader);
                  return getFile(fullChain, zipName + File.separator + entry.getName(), inner);
      
              }
              //If we get down here, it means either we recursed into not-a-zip file, or 
              //the file was otherwise not found
              if (isZip) {
                  //if this is the terminal node in the chain, it's due to a file not found.
                  throw new FileNotFoundException(zipName + " could not be found!");
              } else {
                  //if not, it's due to this not being a zip file
                  throw new IOException(zipName + " is not a zip file!");
              }
          }
      
          /**
           * Returns a raw input stream for this file. If you just need the string contents,
           * it would probably be easier to use getFileContents instead, however, this method
           * is necessary for accessing binary files.
           * @return An InputStream that will read the specified file
           * @throws FileNotFoundException If the file is not found
           * @throws IOException If you specify a file that isn't a zip file as if it were a folder
           */
          public InputStream getInputStream() throws FileNotFoundException, IOException {
              if (!isZipped) {           
                  return new FileInputStream(file);
              } else {            
                  return getFile(chainedPath, topZip.getAbsolutePath(), new ZipInputStream(new FileInputStream(topZip)));
              }
          }
      
          /**
           * If the file is a simple text file, this function is your best option. It returns
           * the contents of the file as a string.
           * @return
           * @throws FileNotFoundException If the file is not found
           * @throws IOException If you specify a file that isn't a zip file as if it were a folder
           */
          public String getFileContents() throws FileNotFoundException, IOException {
              //getInputStream() already handles both plain and zipped files, so there is
              //no need for the external FileUtility helper, which is not part of this listing.
              return getStringFromInputStream(getInputStream());
          }
      
          /*
           * Converts an input stream into a string
           */
          private String getStringFromInputStream(InputStream is) throws IOException {
              //Uses the platform default charset, as the original did
              BufferedReader din = new BufferedReader(new InputStreamReader(is));
              StringBuilder sb = new StringBuilder();
              try {
                  String line;
                  while ((line = din.readLine()) != null) {
                      sb.append(line).append("\n");
                  }
              } finally {
                  try {
                      din.close(); //also closes the wrapped stream
                  } catch (IOException ex) {
                      //Nothing useful can be done if closing fails
                  }
              }
              return sb.toString();
          }
      
          /**
           * Delegates the equals check to the underlying File object.
           * @param obj
           * @return 
           */
          @Override
          public boolean equals(Object obj) {
              if (obj == null) {
                  return false;
              }
              if (getClass() != obj.getClass()) {
                  return false;
              }
              final ZipReader other = (ZipReader) obj;        
              return other.file.equals(this.file);
          }
      
          /**
           * Delegates the hashCode to the underlying File object.
           * @return 
           */
          @Override
          public int hashCode() {
              return file.hashCode();
          }
      
      
      }
      

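      【Comments】:

        【Answer 3】:

        Sure. You don't need a ZipFile. Create a ZipInputStream that reads its content from the ZipEntry in the outer file.

        【Comments】:

        • You need a ZipFile in order to call the zipfile.getInputStream(ZipEntry entry) method. If I take the ZipEntry's InputStream, I can presumably wrap it in a ZipInputStream and find the child ZipEntries, but can I then get input streams for those child entries in order to read them?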
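
        On the question in the comment: after getNextEntry() returns a child entry, the ZipInputStream itself is positioned on that child's data and reads from it until read() returns -1, so no separate getInputStream call is needed for the inner level. A minimal sketch, assuming the outer archive is a file on disk and the content is UTF-8 text; the class and method names here are made up for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library.
public class NestedEntryReader {

    /** Reads one text file out of a zip that is itself an entry of outerZip. */
    public static String readNestedText(File outerZip, String nestedZipName, String innerName)
            throws IOException {
        try (ZipFile outer = new ZipFile(outerZip)) {
            ZipEntry nested = outer.getEntry(nestedZipName);
            if (nested == null) {
                throw new FileNotFoundException(nestedZipName);
            }
            // The outer level uses ZipFile.getInputStream; the inner level is a plain stream.
            try (ZipInputStream inner = new ZipInputStream(outer.getInputStream(nested))) {
                ZipEntry child;
                while ((child = inner.getNextEntry()) != null) {
                    if (child.getName().equals(innerName)) {
                        // 'inner' now yields exactly this child's decompressed bytes.
                        ByteArrayOutputStream out = new ByteArrayOutputStream();
                        byte[] buf = new byte[8192];
                        int n;
                        while ((n = inner.read(buf)) != -1) {
                            out.write(buf, 0, n);
                        }
                        return new String(out.toByteArray(), StandardCharsets.UTF_8);
                    }
                }
            }
            throw new FileNotFoundException(nestedZipName + "!/" + innerName);
        }
    }
}
```

        The trade-off is that a ZipInputStream can only be walked forwards, so finding a child means scanning the nested zip from its first entry, whereas ZipFile can look entries up by name.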