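【Question title】: Find and verify path strings in text file using PowerShell, RegEx search
【Posted】: 2014-01-28 20:23:49
【Question】:

First time posting here, so I'll try to be clear and detailed, but please be gentle if I missed an existing answer while searching these boards.

First, the questions:

  1. How do I exclude RegEx matches that contain a specific keyword ("fastcopy")?
  2. How do I include path results that do not end in a file name/wildcard?

I'm working with a set of text files that closely resemble batch files. They are plain text and contain header lines, lines with file paths on a server, and comment lines. Comment lines start with a semicolon (;), so they are easy to exclude. The paths should all start with the variable %INSTDIR%, but they may or may not be wrapped in quotes, and they may or may not be followed by execution options. One last point: the company uses FastCopy.exe to copy files/folders down from the network, and on those lines I want to return the folder/file being copied, not the path that contains fastcopy.exe.

Here is a sample (a bit large, to show the potential issues):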

[Installing .NET 3.5 Hotfix KB943326 for App1]
; *** Added NET 3.5 SP1 hotfix KB943326: resolves App1 hidden menus force laptop re-booting
1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\.NET_3.5_Hotfix_KB943326\WindowsXP-KB943326-x86-ENU.exe /quiet /norestart

[Installing Agent 5.3.1]
1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\AGenT_531_2.0\w7wxp_ze_20\install.exe

[Installing APR Manager 2.1]
1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\APRManager_21_Updated_2.0\wviwxp_ze_20\install.exe

[Installing Scope Simulator]
1 = MD "C:\Temp\scope_simulator_10"
2 =  start /wait /high %INSTDIR%\ToolShare$\Site_Toolbox\Custom_Scripts\Source\fastcopy.exe /auto_close /no_confirm_del /no_confirm_stop /log=FALSE /open_window /force_start /force_close /stream=FALSE /cmd=diff "%INSTDIR%\ToolShare$\Sample_Toolbox\applications\scope_simulator_10" /to="C:\Temp\scope_simulator_10"
3 = "C:\Temp\scope_simulator_10\w7wxp_ze_10\Install.exe"
4 = RD "C:\temp\scope_simulator_10" /q /s

[Installing Log Analyzer Offline 2.6.1]
1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\Log_Analyzer_Offline_261\wxp_ze_10\install.exe

[Installing Data Migration Script]
1 = MD "C:\Temp\Data Migration"
2 = xcopy "%INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\*.*" "C:\Temp\Data Migration" /y /e
3 = xcopy "%INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\Data Migration.lnk" C:\DOCUME~1\ALLUSE~1\Desktop\ /Y

I have it set up to pull a "dir \\UNCPath\*.ini" and then loop with a ForEach ($INI in $Results) block. The line I've been trying inside the loop to extract the path from each line is:

gc $ini|?{!($_ -match "^;") -and ($_ -match "%INST[^`"]*?\\.*(\.\w{3}|\.\*)(?=`"|\s|\Z)")}|%{$TestPath = $Matches[0].replace("%INSTDIR%","\\ServerName1");if(test-path $testpath){write-host "  [OK]    " -foregroundcolor Green -NoNewline}else{write-host "[Missing] " -ForegroundColor red -NoNewline};write-host "$testpath"}

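This gets almost everything I want. What it doesn't get is anything that doesn't end in .* or a standard 3-character extension (.exe, .cmd, .jar, etc.). Also, it returns the fastcopy path rather than the path being copied.

My desired results: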

%INSTDIR%\ToolShare$\Sample_Toolbox\applications\.NET_3.5_Hotfix_KB943326\WindowsXP-KB943326-x86-ENU.exe
%INSTDIR%\ToolShare$\Sample_Toolbox\applications\AGenT_531_2.0\w7wxp_ze_20\install.exe
%INSTDIR%\ToolShare$\Sample_Toolbox\applications\APRManager_21_Updated_2.0\wviwxp_ze_20\install.exe
%INSTDIR%\ToolShare$\Sample_Toolbox\applications\scope_simulator_10
%INSTDIR%\ToolShare$\Sample_Toolbox\applications\Log_Analyzer_Offline_261\wxp_ze_10\install.exe
%INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\*.*
%INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\Data Migration.lnk

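I'm not getting the scope_simulator_10 result (instead I get the FastCopy path, and even if I strip FastCopy from the line so that only the desired path remains, it still isn't returned). Any suggestions are welcome.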

【Question comments】:

    Tags: regex powershell


    【Solution 1】:

    The following script should work.

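    # Collect the %INSTDIR% path from every non-comment line of the file.
    $paths = Get-Content $ini | ForEach-Object {
        if ($_ -match "^(?=[^;]).*?(?<delimiter>[""' ])(?<path>%INSTDIR%(?!.*?fastcopy.exe).*?)(?:\1|$)")
        {
            # Emit only the named "path" group, without the delimiter or trailing options.
            Write-Output $Matches["path"]
        }
    }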
    

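    The $paths variable will now contain all of the requested paths. Note that the negative lookahead rejects a path whenever the literal string "fastcopy.exe" appears anywhere after its %INSTDIR% on the same line, so such a path will not be found.

    An attempt at explaining the regular expression: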
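
To try the extraction without the real .ini files, here is a minimal sketch that feeds a few lines from the question (the fastcopy line is shortened) through the same pattern and then checks each result with Test-Path, as the question's loop does. The helper name Test-InstallPath and its parameters are made up for this illustration, and \\ServerName1 is just the placeholder server name from the question.

```powershell
# Sample lines from the question; in the real script they come from Get-Content $ini.
# The fastcopy line is shortened here for readability.
$lines = @(
    '; *** Added NET 3.5 SP1 hotfix KB943326'
    '1 = %INSTDIR%\ToolShare$\Sample_Toolbox\applications\AGenT_531_2.0\w7wxp_ze_20\install.exe'
    '2 =  start /wait /high %INSTDIR%\ToolShare$\Site_Toolbox\Custom_Scripts\Source\fastcopy.exe /cmd=diff "%INSTDIR%\ToolShare$\Sample_Toolbox\applications\scope_simulator_10" /to="C:\Temp\scope_simulator_10"'
    '2 = xcopy "%INSTDIR%\ToolShare$\Sample_Toolbox\Support\Data Migration\*.*" "C:\Temp\Data Migration" /y /e'
    '3 = "C:\Temp\scope_simulator_10\w7wxp_ze_10\Install.exe"'
)

# Same pattern as in the answer, in a single-quoted string
# ('' is an escaped apostrophe, so the "" doubling is not needed).
$pattern = '^(?=[^;]).*?(?<delimiter>["'' ])(?<path>%INSTDIR%(?!.*?fastcopy.exe).*?)(?:\1|$)'

# Keep only the named "path" group of every matching line.
$paths = $lines | ForEach-Object {
    if ($_ -match $pattern) { $Matches['path'] }
}

# Hypothetical helper: swap %INSTDIR% for a real root and report whether the path exists.
function Test-InstallPath {
    param(
        [string[]]$Path,
        [string]$Root    # for example '\\ServerName1', the placeholder used in the question
    )
    foreach ($p in $Path) {
        $resolved = $p.Replace('%INSTDIR%', $Root)
        [pscustomobject]@{
            Path   = $resolved
            Exists = Test-Path $resolved
        }
    }
}

$paths
# Example call (needs a reachable share, so it is left commented out):
# Test-InstallPath -Path $paths -Root '\\ServerName1'
```

The comment line and the line without %INSTDIR% are skipped, so $paths holds three entries: the install.exe path, the scope_simulator_10 folder and the Data Migration\*.* wildcard. Returning objects instead of calling Write-Host is a design choice for this sketch only; it makes the result easy to filter, for example with Where-Object { -not $_.Exists }.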

    ^ - matches the start of the line
    (?=[^;]) - positive lookahead verifying that the first character of the line is not a semicolon
    .*? - any character, as few as possible (to skip everything before the path we want to match)
    (?<delimiter>["' ]) - named group capturing the character immediately before the path: a space, a quotation mark or an apostrophe (in the script the quotation mark is doubled, "", because the pattern sits in a double-quoted PowerShell string)
    (?<path> - starts a named capturing group for the "path"
        %INSTDIR% - matches the literal string '%INSTDIR%'
        (?!.*?fastcopy.exe) - negative lookahead verifying that the rest of the line after this %INSTDIR% does not contain fastcopy.exe. On the fastcopy line the first %INSTDIR% is therefore rejected, while the second one is accepted because nothing after it contains the fastcopy.exe literal string. (The dot is unescaped, so strictly it matches any character; write fastcopy\.exe to match only a literal dot.)
        .*? - matches any character, as few as possible, so that we stop as soon as we reach the matching delimiter character below
    ) - ends the named capturing group "path"
    (?:\1|$) - matches (in a non-capturing group) either the character captured by the delimiter group above (a quotation mark, apostrophe or space, whichever came immediately before the %INSTDIR% literal string) or the end of the line. \1 refers to the delimiter group because .NET numbers named groups in order when the pattern has no unnamed capturing groups; \k<delimiter> is an equivalent, more explicit spelling.
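
The two less obvious points above, the group that \1 refers to and the effect of the negative lookahead, can be checked directly against the .NET Regex class. This is only a sketch: the line below is a made-up one in the style of the fastcopy line from the question, and the $explicit variant (escaped dot, named backreference) is a suggested spelling rather than part of the original answer.

```powershell
# Pattern from the answer, in a single-quoted string ('' is an escaped apostrophe).
$pattern  = '^(?=[^;]).*?(?<delimiter>["'' ])(?<path>%INSTDIR%(?!.*?fastcopy.exe).*?)(?:\1|$)'
# Suggested variant: literal dot and a backreference by name instead of by number.
$explicit = '^(?=[^;]).*?(?<delimiter>["'' ])(?<path>%INSTDIR%(?!.*?fastcopy\.exe).*?)(?:\k<delimiter>|$)'

# Made-up line in the style of the fastcopy line from the question.
$line = '2 = start /wait %INSTDIR%\Tools\fastcopy.exe /cmd=diff "%INSTDIR%\apps\scope_simulator_10" /to="C:\Temp\scope_simulator_10"'

# -match is case-insensitive by default, so use IgnoreCase here as well.
$ignoreCase = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
$regex = New-Object System.Text.RegularExpressions.Regex($pattern, $ignoreCase)

# Number assigned to the named group "delimiter"; this is what \1 points at.
$delimiterNumber = $regex.GroupNumberFromName('delimiter')

$m  = $regex.Match($line)
$m2 = [regex]::Match($line, $explicit, $ignoreCase)

$delimiterNumber                # group number of "delimiter"
$m.Groups['delimiter'].Value    # the character found before the accepted %INSTDIR%
$m.Groups['path'].Value         # path captured by the original pattern
$m2.Groups['path'].Value        # path captured by the variant
```

Here the delimiter group gets number 1, the captured delimiter is the quotation mark in front of the second %INSTDIR%, and both patterns capture %INSTDIR%\apps\scope_simulator_10. The first %INSTDIR% is skipped because fastcopy.exe follows it on the same line.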
    

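    If anything is unclear, please add a comment below asking for clarification.

    【Comments】:

    • Thanks, your suggested code works perfectly. Also, I think I now understand negative lookahead, which will help me a lot in the future.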