【Question title】: Parsing several groups out of repeated pattern returning False
【Posted】: 2022-01-18 18:20:44
【Question description】:

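I'm trying to parse several repeated groups out of a document. For each case statement I need the number in the _stprintf call (e.g. 8000), the description that follows it (e.g. Comm. Err 05 - Timeout sending command), and the severity (warning or fatal). For some reason the result comes back empty. I looked at this match and nomatch, and I think I'm doing what they are doing. Does anyone see my problem, or have any other suggestions?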

#Function to get needed contents from case statements in $parsedCaseMethod provided
Function Get-CaseContents{
  [cmdletbinding()]
  Param ( [string]$parsedCaseMethod, [string]$parseLinesGroupIndicator)
  Process
  {
     Write-Host "This is what I'm dealing with people" -ForegroundColor Green
     Write-Host $parsedCaseMethod
     #parse the case data out:
     #ex from code:
     #case kRESULT_STATUS_PPA_Comm_Timeout:     #_stprintf is parseLinesGroupIndicator              
     #  _stprintf( outDevStr, _T("8005 - (Comm. Err 05) - %s(Timeout sending command)"), errorStr);
     #  outError    = INVALID_PARAM;
     #  outSeverity = CCA_WARNING;
     $regex = [regex]"\((.*)\)" #sdkErr
     $severity = [regex]"[\s\S.=;]*outSeverity[\s\S]*=[\s\Sa-zA-Z]*_(a-zA-Z)*" #severity warning or error etc
     $parsedCaseMethod -match  '$parseLinesGroupIndicator[\s\S]*(?<sdkErr>\d*)[\s\S-]*(?<sdkDesc>$regex)(?<sdkSeverity>$severity)'
     Write-Output "sdkErr:"
     $Matches.sdkErr
     Write-Output "sdkDesc:"
     $Matches.sdkDesc
     Write-Output "sdkSeverity:"
     $Matches.sdkSeverity
  }#End of Process
}#End of Function

#main code
...
#call method to get case info
Get-CaseContents -parsedCaseMethod $matchFound -parseLinesGroupIndicator "_stprintf" #need to get returned info back

An example of the contents of $matchFound:

...
case kRESULT_STATUS_Undefined_Opcode:                       
            _stprintf( outDevStr, _T("8004 - (Comm. Err 04) - %s(Undefined Opcode)"), errorStr);
            outError    = INVALID_PARAM;
            outSeverity = CCA_WARNING;
            break;

        case kRESULT_STATUS_Comm_Timeout:                       
            _stprintf( outDevStr, _T("8005 - (Comm. Err 05) - %s(Timeout sending command)"), errorStr);
            outError    = INVALID_PARAM;
            outSeverity = CCA_WARNING;
            break;

        case kRESULT_STATUS_TXD_Failed:                     
            _stprintf( outDevStr, _T("8006 - (Comm. Err 06) - %s(TXD Failed--Send buffer overflow.)"), errorStr);
            outError    = INVALID_PARAM;
            outSeverity = CCA_WARNING;
            break;
...

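It would be fine if the three values found ended up in an array, since that would make them easier to return from the function, but I haven't gotten that far yet.

With the function's print statements, it shows the string contents as above, followed by this empty output: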

bool 
False
sdkErr:
sdkDesc:
sdkSeverity:

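I expected it to eventually return the following (I realize I'm not collecting this in an array, so perhaps only the last group's information):

sdkErr: 8004, sdkDesc: (Comm. Err 04) - %s(Undefined Opcode), sdkSeverity: WARNING

sdkErr: 8005, sdkDesc: (Comm. Err 05) - %s(Timeout sending command), sdkSeverity: WARNING

sdkErr: 8006, sdkDesc: (Comm. Err 06) - %s(TXD Failed--Send buffer overflow.), sdkSeverity: WARNING

This is PowerShell 5.1. Any suggestions would be much appreciated!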

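Update: I tried changing the input string so that I know exactly what is being parsed, and it now parses only a single case statement. It still doesn't return sdkErr or the other values it prints. The code looks fine to me, and I'm not sure what I'm missing. I was reading about backticks and decided I need backticks in the substring I hard-coded, because I was getting "Input string was not in a correct format" with the string I pulled out for testing. In the changes below, I redefined the parameter as a smaller string for testing and modified my regex, since what I had before wasn't working.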

#Function to get needed contents from case statements in $parsedCaseMethod provided
Function Get-CaseContents{
  [cmdletbinding()]
  Param ( [string]$parsedCaseMethod, [string]$parseLinesGroupIndicator)
  Process
  {
     #Write-Host "This is what I'm dealing with people" -ForegroundColor Green
     #Write-Host $parsedCaseMethod
     #parse the case data out:
     #ex from code:
     #case kRESULT_STATUS_Comm_Timeout:     #_stprintf is parseLinesGroupIndicator              
     #  _stprintf( outDevStr, _T("8005 - (Comm. Err 05) - %s(Timeout sending command)"), errorStr);
     #  outError    = INVALID_PARAM;
     #  outSeverity = CCA_WARNING;
     $parseCaseMethod = "case kRESULT_STATUS_Comm_Timeout:              
       _stprintf( outDevStr, _T(`"8005 - (Comm. Err 05) - %s(Timeout sending command)`"), errorStr);
       outError = INVALID_PARAM;
       outSeverity  = CCA_WARNING;"
     $regexNum = [regex]"$parseLinesGroupIndicator[\s\S.]*_T[.*](0-9)*"
     $regex = [regex]"\((.*)\)" #sdkErr

     $severity = [regex]"[\s\S.=;]*outSeverity[\s\S]*=[\s\Sa-zA-Z]*_(a-zA-Z)*" #severity warning or error etc
     ##$parsedCaseMethod -match  "$parseLinesGroupIndicator[\s\S]*(?<sdkErr>\d*)[\s\S-]*(?<sdkDesc>$regex)(?<sdkSeverity>$severity)"
     $parsedCaseMethod -match  "$regexNum(?<sdkErr>\d*)[\s\S-]*(?<sdkDesc>$regex)(?<sdkSeverity>$severity)"
     Write-Output "sdkErr:"
     $Matches.sdkErr
     Write-Output "sdkDesc:"
     $Matches.sdkDesc
     Write-Output "sdkSeverity:"
     $Matches.sdkSeverity
  }#End of Process
}#End of Function 

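Update 2: I've been playing with a regex editor, regex101.com, and it matches what I show for $regexNum, but for some reason printing $Matches.sdkErr doesn't return what I expect. I'm not sure whether that's because the editor shows the 8000 part as a group and there is a different way to get at it. I tried $Matches.sdkErr.Group(1) but got this error: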

Method invocation failed because [System.String] does not contain a method named 'Group'.

Here are the code changes (mostly in $regexNum):

#Function to get needed contents from case statements in $parsedCaseMethod provided
Function Get-CaseContents{
  [cmdletbinding()]
  Param ( [string]$parsedCaseMethod, [string]$parseLinesGroupIndicator)
  Process
  {
     #Write-Host "This is what I'm dealing with people" -ForegroundColor Green
     #Write-Host $parsedCaseMethod
     #parse the case data out:
     #ex from code:
     #case kRESULT_STATUS_PPA_Comm_Timeout:     #_stprintf is parseLinesGroupIndicator              
     #  _stprintf( outDevStr, _T("8005 - (Comm. Err 05) - %s(Timeout sending command)"), errorStr);
     #  outError    = INVALID_PARAM;
     #  outSeverity = CCA_WARNING;
     $parsedCaseMethod = "case kRESULT_STATUS_Comm_Timeout:             
       _stprintf( outDevStr, _T(`"8005 - (Comm. Err 05) - %s(Timeout sending command)`"), errorStr);
       outError = INVALID_PARAM;
       outSeverity  = CCA_WARNING;"
     ##$regexNum = [regex]"$parseLinesGroupIndicator[\s\Sa-zA-Z]*(0-9)*" 
     $regexNum = [regex]"$parseLinesGroupIndicator[\s\Sa-zA-Z]*_T[^[0-9]]*.+?([0-9][0-9]*)"
     # \s\S\(`",a-zA-Z.]*_T[.*](0-9)*
     $regex = [regex]"\((.*)\)" #sdkErr

     $severity = [regex]"[\s\S.=;]*outSeverity[\s\S]*=[\s\Sa-zA-Z]*_(a-zA-Z)*" #severity warning or error etc
     #$parsedCaseMethod -match  "$parseLinesGroupIndicator[\s\S]*(?<sdkErr>\d*)[\s\S-]*(?<sdkDesc>$regex)(?<sdkSeverity>$severity)"
     #$parsedCaseMethod -match  "$regexNum(?<sdkErr>\d*)[\s\S-]*(?<sdkDesc>$regex)(?<sdkSeverity>$severity)"
     $parsedCaseMethod -match  "(?<sdkErr>$regexNum)[\s\S-]*(?<sdkDesc>$regex)(?<sdkSeverity>$severity)"
     Write-Output "sdkErr:"
     $Matches.sdkErr.Group(1)  #error message...with $Matches.sdkErr it prints entire Match and not just the part in parenthesis that I want (8000)
     Write-Output "sdkDesc:"
     $Matches.sdkDesc
     Write-Output "sdkSeverity:"
     $Matches.sdkSeverity
  }#End of Process
}#End of Function

I'm also looking at this: string in text file

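【Comments】:

  • The $parsedCaseMethod -match '...' line uses single quotes, so the variables are not expanded. If you swap them for double quotes, the function seems to work. The regex itself may need some work, but I'm not sure what result you're looking for from each match group.
  • @Cpt.Whale With double quotes it returns blank for sdkErr. I'm trying to fix that with the Update above, but it still doesn't find sdkErr.
  • @Cpt.Whale Switching to double quotes was a good idea. It did help a bit, but it still doesn't return an sdkErr such as 8000.
  • Is this really a minimal reproducible example? ..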
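
The quoting point from the first comment can be seen in isolation. The following is a minimal sketch, not code from the thread: $indicator and $line are hypothetical stand-ins for $parseLinesGroupIndicator and one line of the input, and the pattern is simplified for illustration.

```powershell
# Hypothetical sample values, for illustration only.
$indicator = '_stprintf'
$line = '_stprintf( outDevStr, _T("8005 - (Comm. Err 05)"), errorStr);'

# Single quotes: no variable expansion, so the regex contains the literal
# text "$indicator". In a regex, "$" anchors to the end of the input, so
# nothing can follow it and the match fails.
$single = $line -match '$indicator\D*(?<sdkErr>\d+)'

# Double quotes: $indicator is expanded to "_stprintf" before the regex
# is built, so the pattern can match and fill the named group.
$double = $line -match "$indicator\D*(?<sdkErr>\d+)"
$sdkErr = $Matches.sdkErr

"single: $single, double: $double, sdkErr: $sdkErr"
# → single: False, double: True, sdkErr: 8005
```

If the indicator could contain regex metacharacters, wrapping it in [regex]::Escape() before interpolating it would be a reasonable precaution.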

Tags: regex powershell


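【Solution 1】:

Your regex has some incorrect syntax; see the comments below on the suggested replacements. I also prefer Select-String over -match, because with -AllMatches it can return multiple matches.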

# construct regex, previously done with variables
$fullregex = [regex]"_stprintf[\s\S]*?_T\D*", # Start of error message, capture until digits
    "(?<sdkErr>\d+)",       # Error number, digits only
    "\D[\s\S]*?",           # match anything, non-greedy
    "(?<sdkDesc>\((.*)\))", # Error description, anything within parentheses
    "[\s\S]*?",             # match anything, non-greedy
    "(?<sdkSeverity>outSeverity\s*=\s[a-zA-Z_]*)", # Capture severity string
    '' -join ''

# run the regex
$Values = $parsedCaseMethod | Select-String -Pattern $fullregex -AllMatches

# Convert Name-Value pairs to object properties
$result = foreach ($match in $Values.Matches){
  [PSCustomObject][ordered]@{
    sdkErr      = $match.Groups['sdkErr']
    sdkDesc     = $match.Groups['sdkDesc']
    sdkSeverity = $match.Groups['sdkSeverity']
  }
}

$result
sdkErr sdkSeverity               sdkDesc                                                            
------ -----------               -------                                                            
8004   outSeverity = CCA_WARNING (Comm. Err 04) - %s(Undefined Opcode)"), errorStr)                 
8005   outSeverity = CCA_WARNING (Comm. Err 05) - %s(Timeout sending command)"), errorStr)          
8006   outSeverity = CCA_WARNING (Comm. Err 06) - %s(TXD Failed--Send buffer overflow.)"), errorStr)

I made a lot of assumptions about what you're looking for, but this might help.
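
As a variation on the same idea, here is a sketch (not the answerer's code) that uses [regex]::Matches, stops the description at the closing double quote, and captures only the part of the severity after the underscore. The two-case $text sample and the exact $pattern are assumptions based on the input shown in the question.

```powershell
# Hypothetical two-case sample in the same shape as the question's input.
$text = @'
case kRESULT_STATUS_Undefined_Opcode:
    _stprintf( outDevStr, _T("8004 - (Comm. Err 04) - %s(Undefined Opcode)"), errorStr);
    outError    = INVALID_PARAM;
    outSeverity = CCA_WARNING;
    break;
case kRESULT_STATUS_Comm_Timeout:
    _stprintf( outDevStr, _T("8005 - (Comm. Err 05) - %s(Timeout sending command)"), errorStr);
    outError    = INVALID_PARAM;
    outSeverity = CCA_WARNING;
    break;
'@

# Assumed pattern: number, then everything up to the closing quote as the
# description, then the severity suffix after the underscore.
$pattern = '_stprintf[\s\S]*?_T\("(?<sdkErr>\d+) - (?<sdkDesc>[^"]*)"' +
           '[\s\S]*?outSeverity\s*=\s*[A-Za-z]+_(?<sdkSeverity>[A-Za-z]+)'

# [regex]::Matches returns every match, one per case block here.
$result = foreach ($m in [regex]::Matches($text, $pattern)) {
    [PSCustomObject]@{
        sdkErr      = $m.Groups['sdkErr'].Value
        sdkDesc     = $m.Groups['sdkDesc'].Value
        sdkSeverity = $m.Groups['sdkSeverity'].Value
    }
}

$result
```

Using .Value turns each group into a plain string, which makes later string operations on the properties more predictable than working with the Group objects themselves.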

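【Comments】:

  • Any idea how to get WARNING instead of CCA_WARNING? I expected this to do it, but it seems to process forever when I change the sdkSeverity line and add a line after it: "(?<sdkSeverity>outSeverity\s*=\s[a-zA-Z_]*?)", # capture severity string, non-greedy "(\s[a-zA-Z]*)", # capture what follows the _
  • I made these changes: sdkSeverity = ($match.Groups['sdkSeverity'] -split '_')[-1] and "(?<sdkDesc>\((.+?)\))", # error description, anything within parentheses, non-greedy. Other than that, this is a great solution. Thanks!
  • It looks like I also need outError, the value to the right of the equals sign. I'm trying this, but it processes forever. I tested the individual regex I added and it finds it. Any ideas? # construct regex $fullregex = [regex]"stprintf[\s\S]*?_T\D*", "(?<sdkErr>\d+)", "\D[\s\S]*?", "(?<sdkDesc>\((.+?)\))", "[\s\S]*?", "(outError\s*=(?<sdkOutErr>\s[a-zA-Z]*))", ####### "(?<sdkSeverity>outSeverity\s*=\s[a-zA-Z_]*)", ... and then added this to the result section: sdkOutErr = $match.Groups['sdkOutErr']
  • I figured it out... stprintf[\s\S]*?_T\D*(?<sdkErr>\d+)\D[\s\S]*?(?<sdkDesc>\((.+?)\))([\s\S]*?outError\s*=(?<sdkOutErr>\s[a-zA-Z]*))[\s\S]*?(?<sdkSeverity>outSeverity\s*=\s[a-zA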
【Solution 2】:

I wasn't sure whether you want the output as message strings or the parsed information as an array of objects, so the code below does both:

$matchFound = @"
case kRESULT_STATUS_Undefined_Opcode:                       
            _stprintf( outDevStr, _T("8004 - (Comm. Err 04) - %s(Undefined Opcode)"), errorStr);
            outError    = INVALID_PARAM;
            outSeverity = CCA_WARNING;
            break;

        case kRESULT_STATUS_Comm_Timeout:                       
            _stprintf( outDevStr, _T("8005 - (Comm. Err 05) - %s(Timeout sending command)"), errorStr);
            outError    = INVALID_PARAM;
            outSeverity = CCA_WARNING;
            break;

        case kRESULT_STATUS_TXD_Failed:                     
            _stprintf( outDevStr, _T("8006 - (Comm. Err 06) - %s(TXD Failed--Send buffer overflow.)"), errorStr);
            outError    = INVALID_PARAM;
            outSeverity = CCA_WARNING;
            break;
"@ -split '\r?\n'

$inCase = $false  # a flag telling us if we're inside a 'case' or not
$result = switch -Regex ($matchFound) {
    '^\s*case kRESULT' { 
        $inCase = $true 
        # create an object with for now null values in its properties
        $out = [PsCustomObject]@{ sdkErr = $null; sdkDesc = $null; sdkSeverity = $null }
    }
    '^\s*_stprintf\(\s*outDevStr, _T\("(\d+) - (\(.+\))"'  { 
        if ($inCase) {
            $out.sdkErr  = $matches[1]
            $out.sdkDesc = $matches[2]
        }
    }
    '^\s*outSeverity = (\w+)' {
        if ($inCase) {
            $out.sdkSeverity = ($matches[1] -split '_')[-1]
            # now we have all info, output a string to the console
            Write-Host ('sdkErr: {0}, sdkDesc: {1}, sdkSeverity: {2}' -f $out.sdkErr, $out.sdkDesc, $out.sdkSeverity)
            # and output the completed object to be collected in $result
            $out
            # reset the flag so we can rebuild the object for the next 'case kRESULT'
            $inCase = $false
        }
    }
}

Output on screen:

sdkErr: 8004, sdkDesc: (Comm. Err 04) - %s(Undefined Opcode), sdkSeverity: WARNING
sdkErr: 8005, sdkDesc: (Comm. Err 05) - %s(Timeout sending command), sdkSeverity: WARNING
sdkErr: 8006, sdkDesc: (Comm. Err 06) - %s(TXD Failed--Send buffer overflow.), sdkSeverity: WARNING

Output captured in the variable $result:

$result | Format-Table -AutoSize
sdkErr sdkDesc                                                sdkSeverity
------ -------                                                -----------
8004   (Comm. Err 04) - %s(Undefined Opcode)                  WARNING    
8005   (Comm. Err 05) - %s(Timeout sending command)           WARNING    
8006   (Comm. Err 06) - %s(TXD Failed--Send buffer overflow.) WARNING
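
The switch branches read their capture groups from the automatic $Matches variable by index. As a small sketch of how that variable behaves (the $line sample and the pattern are assumptions for illustration): $Matches is a hashtable whose values are plain strings, which is why calling .Group(1) on one of them fails; a group nested inside a named group is read by its own number or name instead.

```powershell
# Hypothetical single line taken from the question's sample input.
$line = '_stprintf( outDevStr, _T("8005 - (Comm. Err 05) - %s(Timeout sending command)"), errorStr);'

if ($line -match '_T\("(?<sdkErr>\d+) - (?<sdkDesc>\((.+?)\))') {
    $kind  = $Matches.GetType().Name   # Hashtable
    $err   = $Matches['sdkErr']        # named group, read by key
    $desc  = $Matches.sdkDesc          # same thing with property syntax
    $inner = $Matches[1]               # the unnamed inner group, read by number
}

"{0}: err={1}, desc={2}, inner={3}" -f $kind, $err, $desc, $inner
# → Hashtable: err=8005, desc=(Comm. Err 05), inner=Comm. Err 05
```

In .NET regexes, unnamed groups are numbered before named ones, so the inner parenthesized group is number 1 here even though it appears after sdkErr in the pattern.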
