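【Posted】: 2021-10-13 17:20:03
【Problem description】:
I am trying to use PowerShell to capture specific key-value pairs from a text file that also contains data not in the key : value pattern. Can anyone help me? I am new to PowerShell, so the code I have tried so far was put together with help from the Internet. Any help would be much appreciated.
Sample source text: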
ResourceGroupName : DataLake-Gen2
DataFactoryName : dna-production-gen2
TriggerName : TRG_RP_Optimizely_Import
TriggerRunId : 08586050680855766354964895535CU57
TriggerType : ScheduleTrigger
TriggerRunTimestamp : 8/4/2020 10:59:59 AM
Status : Succeeded
TriggeredPipelines : {[PL_DATA_OPTIMIZELY_MART, 1f89fc3a-27b5-442e-9685-a444f751f607]}
Message :
Properties : {[TriggerTime, 8/4/2020 10:59:59 AM], [ScheduleTime, 8/4/2020 11:00:00 AM], [triggerObject, {
"name": "Trigger_421B8CAF-BE66-42CF-83DA-E3028693F304",
"startTime": "2020-08-04T10:59:59.8982174Z",
"endTime": "2020-08-04T10:59:59.8982174Z",
"scheduledTime": "2020-08-04T11:00:00Z",
"trackingId": "fdf58bb2-ecd5-4fe9-b2ef-d94fd71729c3",
"clientTrackingId": "08586050680855766354964895535CU57",
"originHistoryName": "08586050680855766354964895535CU57",
"code": "OK",
"status": "Succeeded"
}]}
AdditionalProperties : {[groupId, 08586050680855766354964895535CU57]}
ResourceGroupName : DataLake-Gen2
DataFactoryName : dna-production-gen2
TriggerName : TRG_RP_Optimizely_Import
TriggerRunId : 08586049816852049265494275953CU24
TriggerType : ScheduleTrigger
TriggerRunTimestamp : 8/5/2020 11:00:00 AM
Status : Succeeded
TriggeredPipelines : {[PL_DATA_OPTIMIZELY_MART, dd6b5beb-b7f6-44ef-8903-34c845003dfc]}
Message :
Properties : {[TriggerTime, 8/5/2020 11:00:00 AM], [ScheduleTime, 8/5/2020 11:00:00 AM], [triggerObject, {
"name": "Trigger_421B8CAF-BE66-42CF-83DA-E3028693F304",
"startTime": "2020-08-05T11:00:00.2662252Z",
"endTime": "2020-08-05T11:00:00.2662252Z",
"scheduledTime": "2020-08-05T11:00:00Z",
"trackingId": "ba223bbd-8cb2-40e8-951f-87130dbbbfe8",
"clientTrackingId": "08586049816852049265494275953CU24",
"originHistoryName": "08586049816852049265494275953CU24",
"code": "OK",
"status": "Succeeded"
}]}
AdditionalProperties : {[groupId, 08586049816852049265494275953CU24]}
Code currently used:
[CmdletBinding()]
Param(
    [Parameter(Mandatory=$true)]
    $path
)
function Format-LogFile {
    [CmdletBinding()]
    param (
        $log
    )
    $targets = 'TriggerRunTimestamp', 'ResourceGroupName', 'DataFactoryName', 'TriggerName', 'TriggerRunId', 'TriggerType', 'Status'
    [System.Collections.ArrayList]$lines = @()
    # Keep each distinct line that mentions one of the target keys.
    $log | ForEach-Object {
        $line = $_
        $targets | ForEach-Object {
            if ($line.Contains($_) -and $line -notin $lines) {
                $lines.Add($line) | Out-Null
            }
        }
    }
    # $lines[0] = $lines[0].TrimStart("JournalSMS ")
    # With the return commented out, the function emits nothing and Get-LogFields receives $null.
    return $lines
}
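function Get-LogFields {
    [CmdletBinding()]
    param (
        $lines
    )
    $targets = 'TriggerRunTimestamp', 'ResourceGroupName', 'DataFactoryName', 'TriggerName', 'TriggerRunId', 'TriggerType', 'Status'
    # Match whole "Key : Value" lines. The pattern as posted required a literal "/]"
    # before the colon, so it matched none of the sample lines.
    $matchs = $lines | Select-String -Pattern '^\s*\w+\s*:.*$' -AllMatches
    $dict = @{}
    $matchs.Matches | ForEach-Object {
        $val = $_.Value
        # Split on the first colon only, so timestamps such as 10:59:59 stay intact.
        $arr = $val -split ':', 2
        if ($arr[0].Trim() -in $targets) {
            # Indexer assignment instead of Add(): Add() throws when a key repeats
            # across records. Note that only the last value per key is kept.
            $dict[$arr[0].Trim()] = $arr[1].Trim()
        }
    }
    return $dict
}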
$log = Get-Content 'D:\\output.txt'
$path = "D:\\output.txt"
$info = Get-ChildItem -File -Recurse -Path $path | ForEach-Object {
    $log = Get-Content $_.FullName -Encoding Default
    $lines = Format-LogFile $log
    $dict = Get-LogFields $lines
    $values = New-Object -TypeName psobject -Property $dict
    return $values
}
# $info |
# Select-Object @{name='TriggerRunTimestamp';expression={$_.'TriggerRunTimestamp'}},
# @{name='ResourceGroupName';expression={$_."ResourceGroupName"}},
# @{name='DataFactoryName';expression={$_.'DataFactoryName'}},
# @{name='TriggerName';expression={$_.'TriggerName'}},
# @{name='TriggerRunId';expression={$_.'TriggerRunId'}}
# @{name='TriggerType';expression={$_.'TriggerType'}}
# @{name='Status';expression={$_.'Status'}}|
# Export-Csv -Encoding UTF8 -Path .\result.csv -Force
$info |
    Select-Object 'TriggerRunTimestamp', "ResourceGroupName", 'DataFactoryName',
        'TriggerName', 'TriggerRunId', 'TriggerType', 'Status' |
    ConvertTo-CSV -Delimiter ";" -NoTypeInformation |
    ForEach-Object { $_.Replace('"','') } |
    Set-Content -Path 'D:\\result.csv' -Force
# Export-Csv -Encoding UTF8 -Path .\result.csv -Force
Expected output:
TriggerRunTimestamp | ResourceGroupName | DataFactoryName | TriggerName | TriggerRunId | TriggerType | Status | TriggeredPipeline | Properties_TriggerTime | Properties_ScheduleTime | triggerObject_name | triggerObject_startTime | triggerObject_endTime | triggerObject_scheduledTime
8/4/2020 10:59 | DataLake-Gen2 | dna-production-gen2 | TRG_RP_Optimizely_Import | 08586050680855766354964895535CU57 | ScheduleTrigger | Succeeded | PL_DATA_OPTIMIZELY_MART | 8/4/2020 10:59 | 8/4/2020 11:00 | Trigger_421B8CAF-BE66-42CF-83DA-E3028693F304 | 2020-08-04T10:59:59.8982174Z | 2020-08-04T10:59:59.8982174Z | 2020-08-04T11:00:00Z
Note: the first row holds the column headers (shown in bold in the original post); the second row holds the values as plain text.
Help is urgently needed!!
Thanks
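【Discussion】:
- My efforts so far have focused on PSCustomObject, Get-Content with UTF encoding, hashtables, and arrays, but I would really appreciate some guidance. What I am trying to do is: recursively loop through the input source text files; extract the key-value pairs from the text files, as described in the code; and export them as columns in a single CSV file.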
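For illustration, the steps in the comment above could be sketched roughly as follows. This is only a minimal sketch, not a tested solution: the function name `ConvertFrom-TriggerLog` is hypothetical, it assumes every record begins with a `ResourceGroupName :` line as in the sample text, and it only handles the top-level keys (the nested `TriggeredPipelines` and `Properties` values from the expected output are not extracted).

```powershell
# Hypothetical helper (not part of the original post): turns the lines of one
# log file into one object per record.
# Assumption: each record starts with a 'ResourceGroupName :' line.
function ConvertFrom-TriggerLog {
    param([string[]]$Lines)

    $targets = 'TriggerRunTimestamp', 'ResourceGroupName', 'DataFactoryName',
               'TriggerName', 'TriggerRunId', 'TriggerType', 'Status'
    $record = $null

    foreach ($line in $Lines) {
        # Only top-level "Key : Value" lines match. The JSON lines start with a
        # double quote, so they are skipped. The key stops at the first colon,
        # which keeps timestamps such as 10:59:59 in one piece.
        if ($line -match '^\s*(?<key>\w+)\s*:\s*(?<value>.*)$') {
            $key   = $Matches['key']
            $value = $Matches['value'].Trim()

            if ($key -eq 'ResourceGroupName') {
                # A new record begins: emit the previous one, if any.
                if ($null -ne $record) { [pscustomobject]$record }
                $record = [ordered]@{}
                # Pre-fill all columns so every row has the same column order.
                foreach ($t in $targets) { $record[$t] = $null }
            }
            if ($null -ne $record -and $key -in $targets) {
                $record[$key] = $value
            }
        }
    }
    # Emit the last record of the file.
    if ($null -ne $record) { [pscustomobject]$record }
}

# Small in-memory sample taken from the source text in the question.
$sample = @(
    'ResourceGroupName : DataLake-Gen2'
    'TriggerRunId : 08586050680855766354964895535CU57'
    'TriggerRunTimestamp : 8/4/2020 10:59:59 AM'
    'Status : Succeeded'
    '"status": "Succeeded"'
    'ResourceGroupName : DataLake-Gen2'
    'TriggerRunId : 08586049816852049265494275953CU24'
    'TriggerRunTimestamp : 8/5/2020 11:00:00 AM'
    'Status : Succeeded'
)
$rows = @(ConvertFrom-TriggerLog -Lines $sample)
$rows | ConvertTo-Csv -NoTypeInformation

# For real files, something along these lines should work ($path as in the question):
# Get-ChildItem -File -Recurse -Path $path |
#     ForEach-Object { ConvertFrom-TriggerLog -Lines (Get-Content $_.FullName) } |
#     Export-Csv -Path .\result.csv -NoTypeInformation -Encoding UTF8
```

Building one object per record, rather than one hashtable per file, avoids the duplicate-key problem when a file holds several trigger runs, and lets `Export-Csv` produce one row per run.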
Tags: arrays json powershell key-value