How do I find all files in a folder whose names contain words from a list?

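【Question title】: How do I find all files in a folder whose names contain words from a list?
【Posted】: 2020-05-16 04:55:36
【Question】:

I have a large set of files whose names contain numbers, and separately a list of numbers. Using PowerShell (or any other Windows tool), I need to find the files whose names contain any of the numbers in that list.

I know how to find them one at a time: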

Get-ChildItem | Where-Object {$_.Name -like "*123*"}

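But I don't know how to search for the whole list without chaining -or operators.

【Comments】:

In a comment below you said, "My search list has hundreds of numbers, so doing it even by hand is painful." Please state this requirement (that you don't want to enumerate the numbers individually) more explicitly in your question.

Tags: powershell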

【Solution 1】:

get-childitem *123*,*456*,*789*

With the patterns in a file:

get-childitem -name | select-string (get-content patterns.txt)
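
Note that Select-String treats each pattern as a regular expression by default and returns match objects rather than plain strings. A minimal sketch of a literal-match variant, using in-memory sample data ($patterns and $names are hypothetical stand-ins for the contents of patterns.txt and the output of get-childitem -name):

```powershell
# Hypothetical stand-ins for (Get-Content patterns.txt) and (Get-ChildItem -Name).
$patterns = '123', '456'
$names    = 'a123.txt', 'b.txt', 'c456.log'

# -SimpleMatch treats every pattern as literal text instead of a regex.
$hits = $names | Select-String -Pattern $patterns -SimpleMatch

# Each hit is a match object; its Line property holds the original file name.
$hits.Line
```

With real files, the two sample arrays would be replaced by the Get-Content and Get-ChildItem -Name calls from the answer above.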

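【Discussion】:

【Solution 2】:

One efficient approach is to use the regex-based -match operator, the regular-expression matching operator, with alternation (|) to search for any one of several patterns in a single operation: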

$numbers = 42, 43, 44 # ...
Get-ChildItem | Where-Object Name -match ($numbers -join '|')
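
Plain integers are safe to join as they are, but if the list may contain regex metacharacters (a dot in a version number, for example), each entry probably should be escaped first. A sketch under that assumption, with hypothetical sample data standing in for the real list and file names:

```powershell
# Hypothetical sample data standing in for the real list and for (Get-ChildItem).Name.
$numbers = '42', '43', '1.5'
$names   = 'report_42.txt', 'notes.txt', 'v1.5-build.log', 'v105.log'

# Escape each entry so characters such as '.' are matched literally, then join with '|'.
$pattern = ($numbers | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Keep only the names that match at least one alternative.
$found = $names | Where-Object { $_ -match $pattern }
$found
```

Without the escaping step, the entry 1.5 would also match v105.log, because an unescaped dot matches any character.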

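Alternatively, as js2010's helpful answer shows, you can pass an array of wildcard expressions directly to Get-ChildItem's (implied) -Path parameter, which is of type [string[]], i.e. an array of paths: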

$numbers = 42, 43, 44 # ...
Get-ChildItem ($numbers -replace '^|$', '*')

The above uses the -replace operator to enclose each number in *...*; that is, it is equivalent to:

Get-ChildItem *42*, *43*, *44*

【Discussion】:

【Solution 3】:

Try this:

$files = ( Get-ChildItem 'path' )

$numbers = 1 .. 100 # or your list contents

foreach( $n in $numbers ) {
    foreach( $f in $files.BaseName ) {
        if( $f -like "*$n*" ) {
            "Found $f"
        }
    }
}

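【Discussion】:

【Solution 4】:

As js2010's helpful answer and mklement0 mention, we can pass an array of strings to Get-ChildItem's -Path parameter to do the filtering. These are nice, quick, and elegant solutions, and they work well for a limited set of strings.

The catch appears with @JBourne's comment, which mentions that he has hundreds of numbers to match. When we match hundreds of names against hundreds of file names, the cost of these approaches grows with the product of the two counts. @Vish's very easy to understand answer demonstrates this: with 100 numbers and 1,000 files, you perform 100 x 1,000 = 100,000 evaluations. I assume Get-ChildItem's internal code does something similar when it is handed a string[] array as input.

If we are interested in pure performance, we can't use arrays. Arrays are efficient for storing items and accessing them by index, but poor for random lookups. What we can use instead is a slightly more complex approach built on regex and hashtables. Although a hashtable is a key/value system and we don't need the "value" here, hashtables are very efficient at looking up, matching, and querying large numbers of keys, typically at O(1) cost per lookup. With n numbers and f files, our example goes from an O(n*f) problem to an O(f) problem: we perform only 1 x 1,000 = 1,000 evaluations.

First, we need our list of keys: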

$FileWithListOfNumbers = @"
123 = Matched file with 123
456 = Matched file with 456
789 = Matched file with 789
"@

$KeyHashtable = ConvertFrom-StringData $FileWithListOfNumbers

This loads our hashtable with the list of keys. Next, we iterate over our files and use a regex to match against the file names:

Get-ChildItem | % {
    if($_.Name -match '\D*(\d+)\D*')
    {
        #Filename contains a number, perform a key lookup to see if it matches
        if($KeyHashtable.ContainsKey($Matches[1]))
        {
            Write-Host $_.Name
        }
    }
}

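By matching with a regex (instead of filtering through the file system provider), we can use a capture group to "extract" the number. You may need to adjust the regex for your specific needs and file naming convention, but here it is: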

-match '\D*(\d+)\D*'

\D*    - Match 0 or more non-digits
 (     - Start of capture group
  \d+  - Match 1 or more digits
 )     - End of capture group
\D*    - Match 0 or more non-digits

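The number we "pull out" is stored in the automatic $Matches variable under index 1, $Matches[1]; index 0 holds the entire match. We then use that number to perform a key lookup to see whether it matches anything we are looking for.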
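
The loop above only inspects the first run of digits in each name. A possible variant, sketched here with hypothetical sample data, checks every run of digits and stores the keys in a HashSet, which holds keys only. It assumes PowerShell 5 or later for the ::new() syntax.

```powershell
# Hypothetical sample data; in practice $names could come from (Get-ChildItem).Name.
$names = 'scan_123.pdf', 'a12_b456.txt', 'x1234.txt', 'readme.md'

# A HashSet stores only keys, which is all this lookup needs.
$wanted = [System.Collections.Generic.HashSet[string]]::new()
foreach ($n in '123', '456', '789') { [void]$wanted.Add($n) }

$found = foreach ($name in $names) {
    # Test every run of digits in the name, not just the first one.
    foreach ($m in [regex]::Matches($name, '\d+')) {
        if ($wanted.Contains($m.Value)) {
            $name   # emit the name once
            break   # leave the inner loop; continue with the next name
        }
    }
}
$found
```

As with the hashtable version, a whole run of digits must equal a key, so x1234.txt is not reported for 123. That differs from the -like "*123*" substring test in the question, so whether it is the right behavior depends on the file naming convention.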

【Discussion】: