Glob understanding: answers

【Question title】: Glob understanding
【Posted】: 2016-09-03 05:26:23
【Question】:

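I need to develop a file scanner in Java that takes the following options/parameters:

a directory
one or more patterns, e.g. *.xml, *.txt, *test.csv
a toggle for recursive scanning

I think the best approach is something like this: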

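import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class FileScanningTest {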

    public static void main(String[] args) throws IOException {

        String directory = "C:\\tmp\\scanning\\";
        String glob      = "**/*.xml";
        Boolean rekursiv = false;

        final PathMatcher pathMatcher = FileSystems.getDefault().getPathMatcher("glob:"+glob);

        Files.walkFileTree(Paths.get(directory), new SimpleFileVisitor<Path>() {

            @Override
            public FileVisitResult visitFile(Path path, BasicFileAttributes attrs) throws IOException {
                if (pathMatcher.matches(path)) {
                    System.out.println(path);
                } 
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
                return FileVisitResult.CONTINUE;
            }
        });

    }

}

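I don't understand why I have to put "**/" in front of my actual pattern. It also makes the scan recursive. If I remove the **/, the application finds nothing.

https://docs.oracle.com/javase/tutorial/essential/io/fileOps.html#glob says that ** means recursive, but why does it stop working if I remove it?

Can someone give me a hint?

Thanks everyone, and have a nice weekend.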

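【Comments】:

Tags: java glob

【Answer 1】:

To find *.xml files recursively with a glob, starting from the directory /tmp/scanning/, have a look at this example. It works on Ubuntu Linux and does what you need, much like the Unix find utility. Note that it matches the pattern against the file name only (file.getFileName()), not against the full path, so no "**/" prefix is needed. I have not tested it on any operating system other than Ubuntu, but you should only need to change the file name separator.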

import java.io.*;
import java.nio.file.*;
import java.nio.file.attribute.*;

import static java.nio.file.FileVisitResult.*;

import java.util.*;


public class FileScanningTest {

    public static class Finder
            extends SimpleFileVisitor<Path> {

        private final PathMatcher matcher;
        private int numMatches = 0;

        Finder(String pattern) {
            matcher = FileSystems.getDefault()
                    .getPathMatcher("glob:" + pattern);
        }

        // Compares the glob pattern against
        // the file or directory name.
        void find(Path file) {
            Path name = file.getFileName();
            if (name != null && matcher.matches(name)) {
                numMatches++;
                System.out.println(file);
            }
        }

        // Prints the total number of
        // matches to standard out.
        void done() {
            System.out.println("Matched: "
                    + numMatches);
        }

        // Invoke the pattern matching
        // method on each file.
        @Override
        public FileVisitResult visitFile(Path file,
                                         BasicFileAttributes attrs) {
            find(file);
            return CONTINUE;
        }

        // Invoke the pattern matching
        // method on each directory.
        @Override
        public FileVisitResult preVisitDirectory(Path dir,
                                                 BasicFileAttributes attrs) {
            find(dir);
            return CONTINUE;
        }

        @Override
        public FileVisitResult visitFileFailed(Path file,
                                               IOException exc) {
            System.err.println(exc);
            return CONTINUE;
        }
    }


    public static void main(String[] args)
            throws IOException {
        boolean recursive = false;
        Path startingDir = Paths.get("/tmp/scanning");
        String pattern = "*.{html,xml}";

        Finder finder = new Finder(pattern);
        if (!recursive) {
            Path dir = startingDir;
            List<File> files = new ArrayList<>();
            // Use the same glob as the recursive branch instead of a second hard-coded copy.
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, pattern)) {
                for (Path entry : stream) {
                    files.add(entry.toFile());
                }

                for (File xmlfile : files) {
                    System.out.println(xmlfile);
                }
            } catch (IOException x) {
                throw new RuntimeException(String.format("error reading folder %s: %s",
                        dir,
                        x.getMessage()),
                        x);
            }
        } else {    
            Files.walkFileTree(startingDir, finder);
            finder.done();
        }

    }
}

Test (this output comes from the recursive branch, i.e. with recursive set to true):

 ~> java FileScanningTest
/tmp/scanning/dir2/test2.xml
/tmp/scanning/blah.xml
Matched: 2

If you want to match *.xml or test3.html, you can use this pattern: String pattern = "{*.xml,test3.html}";
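To make the brace syntax concrete, here is a minimal sketch that checks a few file names against the two patterns mentioned in this answer. The class name BraceGlobDemo, the helper nameMatches, and the sample file names other than test3.html are made up for illustration. As in the Finder class above, only the last path element is matched.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical demo class; not part of the code posted in the answer.
final class BraceGlobDemo {

    // Matches the glob against the file name only (the last path element),
    // the same way Finder.find() does.
    static boolean nameMatches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        Path name = Paths.get(path).getFileName();
        return name != null && matcher.matches(name);
    }

    public static void main(String[] args) {
        String[] samples = {"dir2/test2.xml", "test3.html", "other.html", "notes.txt"};
        for (String glob : new String[] {"*.{html,xml}", "{*.xml,test3.html}"}) {
            for (String sample : samples) {
                System.out.println(glob + " / " + sample + " -> " + nameMatches(glob, sample));
            }
        }
    }
}
```

The first pattern accepts any .html or .xml name, while the second accepts any .xml name plus the single literal name test3.html.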

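【Comments】:

Thanks. This works on Windows too. But how can recursive scanning be disabled? We really do need an option to turn it off. And is there a way to use several different patterns at once?
@Hauke Yes. To disable recursive scanning you only need to let the user supply an option, e.g. args[0], and you can combine several different patterns with a regular expression. I can show you, or you can ask a new question. I can update the code I posted to match your specification. I would also like to learn how to do this in Java (I often do it in C).
It would be great if you could update your code. This is only a test class, so a boolean flag "recursive" can be used instead of parsing application arguments. Thanks for your help.
@Hauke This works for multiple file types: String pattern = "*.{html,xml}";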
Great work. How could I write a pattern that covers both "*.xml" and "test.html"?

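【Answer 2】:

The difference between * and ** is that * never matches a directory separator (/ or \, depending on your operating system), whereas ** does. Given a file tree like this: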

a/
  b.xml
c/
  a.xml
da.xml

The pattern *a.xml will match only da.xml (not c/a.xml, because that path contains a /), while the pattern **a.xml will match both da.xml and c/a.xml, and the pattern […] will match only a/b.xml.
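The first two patterns can be checked directly against the relative paths of that tree. This is only a sketch: the class name GlobStarDemo and the helper matches are hypothetical, and the paths are plain strings, so no files need to exist.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical demo class; the names here are not from the answer.
final class GlobStarDemo {

    // Returns true if the glob matches the whole relative path, separators included.
    static boolean matches(String glob, String relativePath) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(relativePath));
    }

    public static void main(String[] args) {
        String[] paths = {"a/b.xml", "c/a.xml", "da.xml"};
        for (String glob : new String[] {"*a.xml", "**a.xml"}) {
            for (String path : paths) {
                System.out.println(glob + " / " + path + " -> " + matches(glob, path));
            }
        }
    }
}
```

Because the matcher is applied to the entire path it is given, a pattern without ** can only succeed on paths that contain no separator at all.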

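【Comments】:

Thanks for the information. Does that mean I can only use the pattern *a.xml if I am already located in the directory and do not pass the whole path?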
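One way to get that behaviour without changing the working directory is to match each path relative to the starting directory rather than the absolute path. The sketch below is an assumption about how this could be done, not code from either answer: the class RelativeGlobScanner and its methods scan and sampleTree are hypothetical, and the recursion toggle is implemented with the maxDepth argument of Files.walk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; class and method names are illustrative only.
final class RelativeGlobScanner {

    // Lists regular files under start whose path relative to start matches the glob.
    // recursive == false limits the walk to the direct children of start (depth 1).
    static List<String> scan(Path start, String glob, boolean recursive) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        int maxDepth = recursive ? Integer.MAX_VALUE : 1;
        try (Stream<Path> paths = Files.walk(start, maxDepth)) {
            return paths.filter(Files::isRegularFile)
                    .map(start::relativize)          // e.g. c/a.xml instead of /tmp/.../c/a.xml
                    .filter(matcher::matches)
                    .map(p -> p.toString().replace('\\', '/'))  // normalise Windows separators
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Recreates the small tree from Answer 2 inside a temporary directory.
    static Path sampleTree() {
        try {
            Path root = Files.createTempDirectory("scanning");
            Files.createDirectories(root.resolve("a"));
            Files.createDirectories(root.resolve("c"));
            Files.createFile(root.resolve("a").resolve("b.xml"));
            Files.createFile(root.resolve("c").resolve("a.xml"));
            Files.createFile(root.resolve("da.xml"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = sampleTree();
        System.out.println(scan(root, "*a.xml", true));
        System.out.println(scan(root, "**a.xml", true));
        System.out.println(scan(root, "**a.xml", false));
    }
}
```

With relative paths, *a.xml finds only files directly in the starting directory even when the walk itself is recursive, and **a.xml descends into subdirectories only when the recursive flag allows it.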