【Question title】: PowerShell regex reading multiple lines
【Posted】: 2020-09-16 15:15:59
【Question description】:

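I am trying to read a file and match across multiple lines with a regex, but I am running into problems. The file I am trying to read looks like this: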

I 09/07/20 05:55PM [Backup Set] Starting backup to CrashPlan Central: 122 files (93.30MB) to back up
I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s: 147 files (197.90MB) backed up, 5.30MB encrypted and sent @ 323.5Kbps (Effective rate: 2.7Mbps)
I 09/07/20 06:00PM  - Unable to backup 1 file (next attempt within 15 minutes)
I 09/07/20 06:15PM [Backup Set] Starting backup to CrashPlan Central: 27 files (250MB) to back up
I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s: 28 files (250MB) backed up, 5MB encrypted and sent @ 302.5Kbps (Effective rate: 4.3Mbps)
I 09/07/20 06:34PM [Backup Set] Starting backup to CrashPlan Central: 18 files (169.30KB) to back up

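The lines appear to end with CR LF. Ultimately, I want to find every line that contains "Completed backup to" and is not immediately followed by a line containing "Unable to backup". However, I am having trouble with even the simplest queries.

This is how I read in the text: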

PS C:\temp> $rawtext = Get-Content '.\new 1.txt' -raw

PS C:\temp> $rawtext.GetType()

IsPublic IsSerial Name                 BaseType
-------- -------- ----                 --------
True     True     String               System.Object


PS C:\temp> $rawtext | Measure-Object -Line

Lines Words Characters Property
----- ----- ---------- --------
    6                       

And here are the results of some simple regex queries:

PS C:\temp> Select-String -InputObject $rawtext -pattern '^.*Completed.*$' # returns nothing
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?m)^.*Completed.*$' # returns the entire contents of $rawtext
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?ms)^.*Completed.*$' # also returns the entire contents of $rawtext
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?ms)^.*Completed.*\r\n$' # returns nothing
PS C:\temp> Select-String -InputObject $rawtext -pattern '(?ms)^.*Completed.*\r\n' # returns the entire contents of $rawtext

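I expected at least one of these queries to return each line containing "Completed". Evidently PowerShell does not handle multiple lines the way I thought. Can anyone shed light on how to work with multi-line regexes in PowerShell?

FWIW, the following command gets exactly what I want in the OSX terminal, and is essentially what I want to replicate in PoSH: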

completedBackups=$(sed '/Completed[[:space:]]backup[[:space:]]to/!d;$!N;/\n.*Unable[[:space:]]to[[:space:]]backup[[:space:]]/!P;D' $f)

【Question discussion】:

    Tags: powershell


    【Solution 1】:

    You can do the following:

    $rawtext = Get-Content '.\new 1.txt' -Raw
    $rawtext | Select-String -Pattern '(?m)^.*?Completed backup to.*$(?!\r?\n.*Unable to backup)' -AllMatches |
        Foreach-Object {$_.Matches.Value}
    

    Explanation:

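    (?m) is multi-line mode, which allows ^ and $ to match at the start and end of every line rather than only at the start and end of the whole string.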

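    (?!) is a negative lookahead, which does not consume any characters. From the end of the line ($), we look ahead and require that what follows is not an optional carriage return \r? and a newline \n, followed by any characters .* (on a single line, because we are not using (?s)) and then Unable to backup.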

    The -AllMatches switch tells the command to keep matching after the first successful match.
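To see what -AllMatches changes, here is a small sketch on an inline string. The two-line sample text is made up for illustration and is not taken from the log.

```powershell
# Hypothetical two-line sample with CR LF line endings
$sample = "Completed backup to A`r`nCompleted backup to B"
$pattern = '(?m)^.*Completed backup to.*$'

# Without -AllMatches, Select-String stops at the first match in the input string
$first = $sample | Select-String -Pattern $pattern
$first.Matches.Count

# With -AllMatches, every match in the same input string is collected
$all = $sample | Select-String -Pattern $pattern -AllMatches
$all.Matches.Count
```

The first count should be 1 and the second 2, since the whole multi-line string is a single input object.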

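    Using the -Raw switch is helpful because it lets us easily look ahead into the next line of text. Without -Raw, we would need to keep track of the lines previously piped into Select-String. That is doable, but it requires a different approach.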
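As a rough sketch of what a line-by-line approach could look like (not the answer's own code), the loop below walks an array of lines by index and checks the following line. The $lines array is a shortened, hypothetical stand-in for the output of Get-Content without -Raw.

```powershell
# Hypothetical sample lines standing in for Get-Content output (no -Raw)
$lines = @(
    'I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s'
    'I 09/07/20 06:00PM  - Unable to backup 1 file (next attempt within 15 minutes)'
    'I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s'
)

$result = for ($i = 0; $i -lt $lines.Count; $i++) {
    if ($lines[$i] -match 'Completed backup to') {
        # Fetch the following line, or an empty string at the end of the input
        $next = if ($i + 1 -lt $lines.Count) { $lines[$i + 1] } else { '' }
        # Emit the line only when the next line does not report a failure
        if ($next -notmatch 'Unable to backup') { $lines[$i] }
    }
}
$result
```

With this sample, only the 06:19PM line should be emitted, because the 06:00PM line is followed by an "Unable to backup" line.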

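    (?s), or single-line mode, would cause problems here when . is used to match characters, because in single-line mode . also matches newline characters.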
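The effect of (?s) can be seen on a short made-up string (the sample text is an assumption for illustration). The sketch uses the .NET [regex] class directly so that only the matched text is shown.

```powershell
# Hypothetical three-line sample using LF line endings
$sample = "first line`nCompleted backup to A`nlast line"

# (?m) only: . stops at each newline, so the match stays on one line
$multiLineOnly = [regex]::Match($sample, '(?m)^.*Completed.*$').Value

# (?ms): . also matches newlines, so the greedy .* runs across every line
$withSingleLine = [regex]::Match($sample, '(?ms)^.*Completed.*$').Value

$multiLineOnly
$withSingleLine
```

This is consistent with the question's (?ms) attempts returning the entire contents of $rawtext.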

    Because Select-String returns MatchInfo objects, you need to access the Value property of the object's Matches property to get the actual matched lines.
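The difference between the MatchInfo object and the matched text is probably also why the (?m) query in the question seemed to return everything: with a single raw string as input, the Line property holds the entire input. A minimal sketch, using a made-up sample string:

```powershell
# Hypothetical three-line sample with CR LF line endings
$sample = "first line`r`nCompleted backup to A`r`nlast line"
$info = $sample | Select-String -Pattern '(?m)^.*Completed.*$'

# Line is the whole input string, because the raw text is one input object
$lineIsWholeInput = $info.Line -eq $sample

# Matches[0].Value is only the matched text. In .NET regexes . also matches
# the carriage return, so with CR LF endings a trailing `r is trimmed here.
$matchedLine = $info.Matches[0].Value.TrimEnd("`r")

$lineIsWholeInput
$matchedLine
```

The same trailing carriage return can appear in the values returned by the pattern above when the file uses CR LF line endings, so trimming it may be worthwhile before further processing.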

    【Discussion】:

      【Solution 2】:

      Why not do it this way...

      # Create the data file
      '
      I 09/07/20 05:55PM [Backup Set] Starting backup to CrashPlan Central: 122 files (93.30MB) to back up
      I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s: 147 files (197.90MB) backed up, 5.30MB encrypted and sent @ 323.5Kbps (Effective rate: 2.7Mbps)
      I 09/07/20 06:00PM  - Unable to backup 1 file (next attempt within 15 minutes)
      I 09/07/20 06:15PM [Backup Set] Starting backup to CrashPlan Central: 27 files (250MB) to back up
      I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s: 28 files (250MB) backed up, 5MB encrypted and sent @ 302.5Kbps (Effective rate: 4.3Mbps)
      I 09/07/20 06:34PM [Backup Set] Starting backup to CrashPlan Central: 18 files (169.30KB) to back up
      ' | 
      Out-File -FilePath 'D:\Temp\BackUpLog.txt'
      
      
      (Get-Content -Path 'D:\temp\BackUpLog.txt').GetType()
      # Results
      <#
      IsPublic IsSerial Name                                     BaseType                                                                                                
      -------- -------- ----                                     --------                                                                                                
      True     True     Object[]                                 System.Array  
      #>
      ((Get-Content -Path 'D:\temp\BackUpLog.txt') | 
      Measure-Object -Line).Lines
      # Results
      <#
      6
      #>
      
      
      (Get-Content -Path 'D:\temp\BackUpLog.txt' -Raw).GetType()
      # Results
      <#
      IsPublic IsSerial Name                                     BaseType                                                                                                
      -------- -------- ----                                     --------                                                                                                
      True     True     String                                   System.Object 
      #>
      
      
      ((Get-Content -Path 'D:\temp\BackUpLog.txt' -Raw) | 
      Measure-Object -Line).Lines
      # Results
      <#
      6
      #>
      
      # Use Select-String with pattern and -AllMatches
      (Get-Content -Path 'D:\temp\BackUpLog.txt').Split([Environment]::NewLine) | 
      Select-String -Pattern 'Completed backup to' -AllMatches
      # Use RegEx matches to collect specific strings
      (Get-Content -Path 'D:\temp\BackUpLog.txt').Split([Environment]::NewLine) -match 'Completed backup to'
      # Results of both are
      <#
      I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s: 147 files (197.90MB) backed up, 5.30MB encrypted and sent @ 323.5Kbps (Effective rate: 2.7Mbps)
      I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s: 28 files (250MB) backed up, 5MB encrypted and sent @ 302.5Kbps (Effective rate: 4.3Mbps)
      #>
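Both commands above return every "Completed backup to" line, including the one followed by "Unable to backup". One possible way to apply the question's exclusion in this line-based style is the -Context parameter of Select-String, which attaches the following line to each match. This is a sketch that is not part of the original answer, and the $lines array is a shortened, hypothetical stand-in for the Get-Content output.

```powershell
# Hypothetical sample lines standing in for Get-Content output (no -Raw)
$lines = @(
    'I 09/07/20 06:00PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:39s'
    'I 09/07/20 06:00PM  - Unable to backup 1 file (next attempt within 15 minutes)'
    'I 09/07/20 06:19PM [Backup Set] Completed backup to CrashPlan Central in 0h:04m:03s'
)

$kept = $lines |
    # -Context 0,1 captures zero lines before and one line after each match
    Select-String -Pattern 'Completed backup to' -Context 0, 1 |
    # Drop matches whose following line reports a failed file
    Where-Object { -not ($_.Context.PostContext -match 'Unable to backup') } |
    ForEach-Object { $_.Line }
$kept
```

With this sample, only the 06:19PM line should remain.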
      

      【Discussion】:
