【Question title】: Parsing text type log file with values surrounded by double quotes and separated by comma
【Posted】: 2014-04-01 15:02:33
【Question】:

I have this log file and I am trying to parse it. The problem is that the data rows have the format "value","value",... and sometimes "value\"value\"",...

#basepath  D:\XHostMachine\Results
#results   test.res
#fields    TestPlan Script TestCase TestData ErrorCount ErrorText DateTime Elapsed
#delimiter , 
#quote     " \

"D:\XHostMachine\plans\test.pln","D:\XHostMachine\testcases\test.t","rt1","1,\"a\"",1,"[#ERROR#][APPS-EUAUTO1] [error] rt1 t1 ( Screen shot : D:\XTestMachines\Error\[APPS-EUAUTO1] 03-28-14 11-29-22.png)","2014-03-28 11.29.04","0:00:18"
"D:\XHostMachine\plans\test.pln","D:\XHostMachine\testcases\test.t","rt2","1,\"a\"",0,"","2014-03-28 11.29.22","0:00:08"

But I cannot split the rows using "," as the delimiter (because a , can appear inside a value).

My code is:

Function Get-RexLog {
    Param ($File)
    # Reads the log file into memory.
    Try {
        Get-Content -Path $File -ErrorAction Stop | Select-Object -Skip 6 # skips the first 6 lines
    } Catch {
        Write-Error "The data file is not present"
        return # return to the caller; 'break' outside a loop can terminate the calling script
    }
} # End: Function Get-RexLog

# -----------------------------------------------------------------------

Function Get-Testplan {
Param ($RexLog)
    for ($i=0; $i -lt $RexLog.Count; $i++) {
        $Testcase = $RexLog[$i].Split("`"[,]`"") | ForEach-Object -Process {$_.TrimStart('"')}
        $Output = New-Object PSobject -Property @{
            TestPlan   = $Testcase[0]
            Script     = $Testcase[1]
            TestCase   = $Testcase[2]
            TestData   = $Testcase[3]
            ErrorCount = $Testcase[4]
            ErrorText  = $Testcase[5]
            DateTime   = $Testcase[6]
            Elapsed    = $Testcase[7]
        }
    }
} # End: Function Get-Testplan

# -----------------------------------------------------------------------

# Parse the files
$RexLog = Get-RexLog -file "D:\XHostMachine\Results\test.rex"
$Testplan = Get-Testplan -RexLog $RexLog
$Testplan

Final edit: using ConvertFrom-Csv

ConvertFrom-Csv -inputobject $RexLog -Header @("TestPlan","Script","TestCase","TestData","ErrorCount","ErrorText","DateTime","Elapsed")
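The header names passed to ConvertFrom-Csv above are typed by hand, although the log already lists them on its `#fields` line. The following is a minimal sketch of reading them from the file instead. It assumes that the names on the `#fields` line are separated by whitespace and that every data row is a non-empty line not starting with `#`. The sample text is a shortened, three-column version of the log in the question, and the variable names are illustrative.

```powershell
# Shortened sample in the same layout as the log file from the question.
$text = @'
#basepath  D:\XHostMachine\Results
#results   test.res
#fields    TestPlan Script TestCase
#delimiter ,
#quote     " \

"D:\XHostMachine\plans\test.pln","D:\XHostMachine\testcases\test.t","rt1"
'@
$lines = $text -split '\r?\n'

# Field names come from the '#fields' metadata line, split on whitespace.
$fieldsLine = $lines | Where-Object { $_ -like '#fields*' } | Select-Object -First 1
$header = ($fieldsLine.Trim() -split '\s+') | Select-Object -Skip 1

# Data rows are the non-empty lines that do not start with '#'.
$data = $lines | Where-Object { $_ -and -not $_.StartsWith('#') }

$rows = $data | ConvertFrom-Csv -Header $header
$rows | Format-List
```

With a real file, `$lines` would come from `Get-Content` rather than from a here-string, and the fixed `-Skip 6` would no longer be needed.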

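【Comments】:

  • This can easily be done with a regular expression. Do you want to use a regex to do the split?
  • @sln Using a regex would be no problem, but I don't know how to implement it. Can you give me a suggestion? Thanks
  • I don't know the PowerShell regex function calls, but I can give you the regex.
  • This format sounds a lot like a CSV file. Have you tried passing it through the Import-CSV cmdlet? (It is designed to handle double quotes around "data" and still deal with the commas.) See @Kayasax's answer below.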
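The regex from the comments was never posted in this thread, so the following is only a sketch of what a regex-based split could look like in PowerShell, not the commenter's expression. It assumes that a quoted field ends at the first quote not preceded by a backslash, which means a quoted value ending in a backslash (for example a directory path with a trailing `\`) would be misread. `Split-QuotedLine` is a hypothetical helper name, and the sample line is a shortened row from the question.

```powershell
function Split-QuotedLine {
    param([string]$Line)

    # A comma is prepended to the line so that every field, including the
    # first, is preceded by one. A field is either quoted (a backslash
    # escapes the following character) or a bare run without commas.
    $pattern = ',(?:"((?:[^"\\]|\\.)*)"|([^,]*))'

    foreach ($m in [regex]::Matches(",$Line", $pattern)) {
        if ($m.Groups[1].Success) {
            # Quoted field: turn \" back into a plain quote and leave
            # every other backslash (such as those in paths) alone.
            $m.Groups[1].Value.Replace('\"', '"')
        } else {
            # Unquoted field, e.g. the numeric ErrorCount column.
            $m.Groups[2].Value
        }
    }
}

$line = '"D:\XHostMachine\plans\test.pln","rt1","1,\"a\"",1,"","0:00:18"'
$fields = @(Split-QuotedLine $line)
$fields
```

Matching whole fields rather than splitting on the delimiter is what keeps the comma inside `"1,\"a\""` from being treated as a separator.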

Tags: regex parsing powershell logfile


【Solution 1】:

PowerShell can easily handle comma-separated value (CSV) text files with the Import-Csv cmdlet.

For example:

PS C:\temp> Import-Csv c:\temp\test.csv -Header @("TestPlan","Script","TestCase","TestData","ErrorCount","ErrorText","DateTime","Elapsed")


TestPlan   : D:\XHostMachine\plans\test.pln
Script     : D:\XHostMachine\testcases\test.t
TestCase   : rt1
TestData   : 1,\a\""
ErrorCount : 1
ErrorText  : [#ERROR#][APPS-EUAUTO1] [error] rt1 t1 ( Screen shot : D:\XTestMachines\Error\[APPS-EUAUTO1] 03-28-14
             11-29-22.png)
DateTime   : 2014-03-28 11.29.04
Elapsed    : 0:00:18

TestPlan   : D:\XHostMachine\plans\test.pln
Script     : D:\XHostMachine\testcases\test.t
TestCase   : rt2
TestData   : 1,\a\""
ErrorCount : 0
ErrorText  :
DateTime   : 2014-03-28 11.29.22
Elapsed    : 0:00:08

【Comments】:

  • Good answer, except that TestData should be converted to 1,"a" rather than 1,\a\""
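
The mismatch noted in the comment arises because the CSV cmdlets expect a quote inside a quoted field to be doubled (`""`), while this log escapes it with a backslash (`\"`). One possible workaround, sketched below, is to rewrite `\"` as `""` before handing the rows to ConvertFrom-Csv. This assumes that `\"` only ever occurs as an escaped quote; a value ending in a backslash right before its closing quote would be corrupted. The two rows are shortened versions of the rows in the question.

```powershell
# Two sample data rows in the log's format (shortened from the question).
$lines = @(
    '"D:\XHostMachine\plans\test.pln","rt1","1,\"a\"",1,""'
    '"D:\XHostMachine\plans\test.pln","rt2","1,\"a\"",0,""'
)

# Rewrite the backslash-escaped quote (\") as the doubled quote ("")
# that the CSV cmdlets understand. String.Replace is a literal replace.
$normalized = $lines | ForEach-Object { $_.Replace('\"', '""') }

$rows = $normalized |
    ConvertFrom-Csv -Header 'TestPlan', 'TestCase', 'TestData', 'ErrorCount', 'ErrorText'

$rows | Format-List
```

All values come back as strings, so ErrorCount would still need a cast such as `[int]` if it is to be compared numerically.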