PowerShell: read a text file line by line and find missing files in folders

【Question title】: PowerShell read text file line by line and find missing file in folders
【Posted】: 2017-11-07 21:40:12
【Question description】:

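I'm new to this and looking for help. I have a text file containing two columns of data: one column is the vendor and the other is the invoice. I need to scan that text file line by line and check whether the vendor and invoice match a path. In the path $Location, the first wildcard is the vendor number and the second wildcard is the invoice. I want the non-matches written to a text file.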

$Location = "I:\\Vendors\*\Invoices\*"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output ="I:\\Vendors\Missing\Missing.txt"
foreach ($line in Get-Content $txt) {
if (-not($line -match $location)){$line}
}
set-content $Output -value $Line

Sample data from the txt or csv file.

kvendnum    wapinvoice
000953  90269211
000953  90238674
001072  11012016
002317  448668
002419  06123711
002419  06137343
002419  06134382
002419  759208
002419  753087
002419  753069
002419  762614
003138  N6009348
003138  N6009552
003138  N6009569
003138  N6009612
003182  770016
003182  768995
003182  06133429

In the data above, the only matches are on the second line (000953 90238674) and the sixth line (002419 06137343).

【Question discussion】:

You are using wildcard syntax with the -match operator, which expects a regular expression. Use -like or -notlike with wildcards.
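A minimal sketch of the difference the comment describes. The path below is made up for illustration and is not taken from the asker's system: `-like` takes a wildcard pattern, while `-match` takes a regular expression, in which literal backslashes must be escaped.

```powershell
# Hypothetical path, for illustration only
$path = 'I:\Vendors\000953\Invoices\90238674.pdf'

# -like uses wildcard syntax: * matches any run of characters
$wildcardHit = $path -like 'I:\Vendors\*\Invoices\*'

# -match uses regex syntax: backslashes are escaped, .+ matches one or more characters
$regexHit = $path -match 'I:\\Vendors\\.+\\Invoices\\.+'

# [regex]::Escape produces the escaped form of a literal string
$escaped = [regex]::Escape('I:\Vendors\')

"like: {0}  match: {1}  escaped: {2}" -f $wildcardHit, $regexHit, $escaped
```

Both comparisons succeed here because each operator gets a pattern written in its own syntax. Passing the wildcard pattern to `-match` would have it parsed as a regular expression, where `\V` and `\*` do not mean what the wildcard intended.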

Tags: powershell

【Solution 1】:

Untested, but here's how I would approach it:

$Location = "I:\\Vendors\\.+\\Invoices\\.+"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output ="I:\\Vendors\Missing\Missing.txt"
select-string -path $txt -pattern $Location -notMatch |
    set-content $Output

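There is no need to walk through the file line by line; PowerShell can do that for you with select-string. The -notMatch parameter simply inverts the search and passes along any lines that do not match the pattern.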

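select-string emits a stream of matchinfo objects for the lines that satisfy the search criteria. These objects actually carry more information than just the matching line, but fortunately PowerShell is smart enough to know how to send the relevant item to set-content.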
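A small sketch of that behaviour, run against in-memory strings rather than a file. The pattern `^002419` and the variable names are chosen only for demonstration:

```powershell
# Sample lines in the style of the question's data
$lines = '000953  90269211', '002419  06137343', '003138  N6009348'

# -NotMatch inverts the search: only lines that do NOT match the pattern pass through
$results = $lines | Select-String -Pattern '^002419' -NotMatch

# Each result is a MatchInfo object; its Line property holds the original text
$kept = $results | ForEach-Object { $_.Line }
$kept
```

Reading the `Line` property explicitly makes it clear which part of the MatchInfo object ends up in the output, instead of relying on the implicit string conversion.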

Regular expressions can be hard to get right, but they are worth the effort if you are going to do this kind of task.

Edit

$Location  = "I:\Vendors\{0}\Invoices\{1}.pdf"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output  = "I:\Vendors\Missing\Missing.txt"

get-content -path $txt | 
    % {

        # extract fields from the line
        $lineItems = $_ -split "  "

        # construct path based on fields from the line
        $testPath = $Location -f $lineItems[0], $lineItems[1]

        # for debugging purposes
        write-host ( "Line:'{0}'  Path:'{1}'" -f $_, $testPath )

        # test for existence of the path; ignore errors
        if ( -not ( get-item -path $testPath -ErrorAction SilentlyContinue ) ) {
            # path does not exist, so write the line to pipeline
            write-output $_ 

        }

    } |
    Set-Content -Path $Output

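I guess we'll end up having to pick through the file line by line after all. If there is a more idiomatic way to do this, I don't know it.

The code above assumes a consistent format in the input file and uses -split to break each line into an array.

Edit - version 3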
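As a hedged sketch of the same idea with two swapped-in pieces: the regex form `-split '\s+'`, which tolerates any run of spaces or tabs, and `Test-Path` in place of `get-item`. The temporary folder, the `$root` and `$template` names, and the file layout are assumptions made so the example is self-contained. They are not the asker's real paths.

```powershell
# Build a throwaway folder tree so the sketch can run anywhere (hypothetical layout)
$root = [System.IO.Path]::Combine([System.IO.Path]::GetTempPath(), [System.Guid]::NewGuid().ToString())
$invoiceDir = [System.IO.Path]::Combine($root, 'Vendors', '000953', 'Invoices')
New-Item -ItemType Directory -Path $invoiceDir -Force | Out-Null
New-Item -ItemType File -Path (Join-Path $invoiceDir '90238674.pdf') | Out-Null

# {0} = vendor number, {1} = invoice number
$template = [System.IO.Path]::Combine($root, 'Vendors', '{0}', 'Invoices', '{1}.pdf')

# Two space-separated lines and one tab-separated line
$lines = '000953  90269211', '000953  90238674', "002419`t06137343"

$missing = foreach ($line in $lines) {
    # Split on any run of whitespace rather than a fixed two-space string
    $vendor, $invoice = $line.Trim() -split '\s+'
    $testPath = $template -f $vendor, $invoice

    # Emit the line only when no such file exists
    if (-not (Test-Path -LiteralPath $testPath -PathType Leaf)) { $line }
}
$missing

# Clean up the temporary tree
Remove-Item -LiteralPath $root -Recurse -Force
```

Because the path is built from both fields, the same invoice number under a different vendor is still reported as missing.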

$Location  = "I:\Vendors\{0}\Invoices\{1}.pdf"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output  = "I:\Vendors\Missing\Missing.txt"

get-content -path $txt | 
    select-string "(\S+)\s+(\S+)" | 
    %{

        # pull vendor and invoice numbers from matchinfo     
        $vendor = $_.matches[0].groups[1]
        $invoice = $_.matches[0].groups[2]

        # construct path
        $testPath = $Location -f $vendor, $invoice

        # for debugging purposes
        write-host ( "Line:'{0}'  Path:'{1}'" -f $_.line, $testPath )

        # test for existence of the path; ignore errors
        if ( -not ( get-item -path $testPath -ErrorAction SilentlyContinue ) ) {
            # path does not exist, so write the line to pipeline
            write-output $_ 
        }

    } |
    Set-Content -Path $Output

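It seems that -split " " behaves differently in a running script than it does on the command line. Weird. Anyway, this version uses a regular expression to parse the input lines. I tested it against the sample data in the original post, and it seems to work.

The regular expression breaks down as follows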

(     Start the first matching group
\S+   Greedily match one or more non-white-space characters
)     End the first matching group
\s+   Greedily match one or more white-space characters
(     Start the second matching group
\S+   Greedily match one or more non-white-space characters
)     End the second matching group
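The breakdown above can be tried on a single line with the `-match` operator, used here as a swapped-in alternative to select-string. It fills the automatic `$Matches` variable with the captured groups. The sample line comes from the question's data.

```powershell
$line = '003138  N6009348'

if ($line -match '(\S+)\s+(\S+)') {
    # $Matches[0] is the whole match; [1] and [2] are the two capture groups
    $vendor  = $Matches[1]
    $invoice = $Matches[2]
}

"Vendor={0} Invoice={1}" -f $vendor, $invoice
```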

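【Discussion】:

Thanks for breaking it down, very helpful. It is reading each line and writing to the $Output path, but it is not outputting only the non-matching or missing items. It is writing every line from $txt to the output.
So I removed the Vendor column from the $txt file and kept only the Invoice column. It works now and outputs only the non-matching lines from $txt. This will work fine unless I have duplicate invoices under two different vendors. Any suggestions for validating both the vendor and the invoice? Also, I looked up .+ but I still don't understand what it does. Can you give me more explanation?
I tried to handle the case where neither matches, but it doesn't output anything. Please take a look.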
$Location = "I:\\Vendors\\.+\\\\Invoices\\.+"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.csv"
$Output ="I:\\Vendors\Missing\Missing2.txt"
$DB = import-csv $txt
foreach ($Line in $DB) {
$First = $Line.kvendnum
$Second = $Line.wapinvoice
write-host" Vend is: "$First
write-host "Inv is: "$Second
Write-Host ""}
If( ($First -match $Location) -and ($Second -notmatch $Location)) { set-content $Output}
Could you update the question with some sample content from the text file, indicating which lines should be included or excluded?