【发布时间】:2018-05-13 20:12:28
【问题描述】:
我一直在尝试解析以下格式的 ASCII 文本文件 --
0 0 0x2de0 [0x98]: PERF_RECORD_MMAP -1/0: [0xffffffffc06ae000(0x5000) @ 0]: x /lib/modules/4.4.0-83-generic/kernel/net/ipv4/netfilter/nf_reject_ipv4.ko
0x2e78 [0x90]: event: 1
.
. ... raw event: size 144 bytes
. 0000: 01 00 00 00 01 00 90 00 ff ff ff ff 00 00 00 00 ................
. 0010: 00 30 6b c0 ff ff ff ff 00 50 00 00 00 00 00 00 .0k......P......
. 0020: 00 00 00 00 00 00 00 00 2f 6c 69 62 2f 6d 6f 64 ......../lib/mod
. 0030: 75 6c 65 73 2f 34 2e 34 2e 30 2d 38 33 2d 67 65 ules/4.4.0-83-ge
. 0040: 6e 65 72 69 63 2f 6b 65 72 6e 65 6c 2f 6e 65 74 neric/kernel/net
. 0050: 2f 69 70 76 34 2f 6e 65 74 66 69 6c 74 65 72 2f /ipv4/netfilter/
. 0060: 69 70 74 5f 52 45 4a 45 43 54 2e 6b 6f 00 2e 6b ipt_REJECT.ko..k
. 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0 0 0x2e78 [0x90]: PERF_RECORD_MMAP -1/0: [0xffffffffc06b3000(0x5000) @ 0]: x /lib/modules/4.4.0-83-generic/kernel/net/ipv4/netfilter/ipt_REJECT.ko
0x2f08 [0x88]: event: 1
.
. ... raw event: size 136 bytes
. 0000: 01 00 00 00 01 00 88 00 ff ff ff ff 00 00 00 00 ................
. 0010: 00 80 6b c0 ff ff ff ff 00 50 00 00 00 00 00 00 ..k......P......
. 0020: 00 00 00 00 00 00 00 00 2f 6c 69 62 2f 6d 6f 64 ......../lib/mod
. 0030: 75 6c 65 73 2f 34 2e 34 2e 30 2d 38 33 2d 67 65 ules/4.4.0-83-ge
. 0040: 6e 65 72 69 63 2f 6b 65 72 6e 65 6c 2f 6e 65 74 neric/kernel/net
. 0050: 2f 6e 65 74 66 69 6c 74 65 72 2f 78 74 5f 74 63 /netfilter/xt_tc
. 0060: 70 75 64 70 2e 6b 6f 00 00 00 00 00 00 00 00 00 pudp.ko.........
. 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0080: 00 00 00 00 00 00 00 00
........[some other data]........
0x11590 [0x30]: PERF_RECORD_AUXTRACE size: 0x2002a0 offset: 0 ref: 0x2d44e6441a3c2 idx: 0 tid: -1 cpu: 0
.
. ... Intel Processor Trace data: size 2097824 bytes
. 00000000: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
. 00000010: 00 00 00 PAD
. 00000013: 99 20 MODE.TSX TXAbort:0 InTX:0
. 00000015: 99 01 MODE.Exec 64
. 00000017: 7d 08 45 06 81 ff ff 00 FUP 0xffff81064508
. 0000001f: 00 00 00 00 00 00 00 PAD
. 00000026: 02 43 00 76 49 1f 00 00 PIP 0xfa4bb00 (NR=0)
. 0000002e: 00 00 00 00 00 00 00 00 PAD
--- continued ---
该文件将有几个标题 - 正如您在我的 sn-p 中看到的那样。
PERF_RECORD_MMAP 和 PERF_RECORD_AUXTRACE
文件中还会有其他标题。
我想要的是我的文本文件中所有具有PERF_RECORD_AUXTRACE 的标题都应该被考虑。我的文件中PERF_RECORD_AUXTRACE 后面的所有数据都应该只收集(即所有以英特尔处理器跟踪数据开头的数据)。 PERF_RECORD_AUXTRACE 标头还有一个 size 字段,我可以使用它指定要在 PERF_RECORD_AUXTRACE 标头中收集的数据量。
编辑#1:
所以基本上,给定上述输入文件 sn-p,我希望输出为以下形式(记录后的所有行都包含 PERF_RECORD_AUXTRACE)...
.
. ... Intel Processor Trace data: size 2097824 bytes
. 00000000: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
. 00000010: 00 00 00 PAD
. 00000013: 99 20 MODE.TSX TXAbort:0 InTX:0
. 00000015: 99 01 MODE.Exec 64
. 00000017: 7d 08 45 06 81 ff ff 00 FUP 0xffff81064508
. 0000001f: 00 00 00 00 00 00 00 PAD
. 00000026: 02 43 00 76 49 1f 00 00 PIP 0xfa4bb00 (NR=0)
. 0000002e: 00 00 00 00 00 00 00 00 PAD
--- continued ---
EDIT #2:这是我的另一个要求 --
如果我有一个像下面这样的输入 sn-p --
0 0 0x230 [0x60]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x3f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
0x290 [0x88]: event: 1
.
. ... raw event: size 136 bytes
. 0000: 01 00 00 00 01 00 88 00 ff ff ff ff 00 00 00 00 ................
. 0010: 00 00 00 c0 ff ff ff ff 00 90 00 00 00 00 00 00 ................
. 0020: 00 00 00 00 00 00 00 00 2f 6c 69 62 2f 6d 6f 64 ......../lib/mod
. 0030: 75 6c 65 73 2f 34 2e 34 2e 30 2d 38 33 2d 67 65 ules/4.4.0-83-ge
. 0040: 6e 65 72 69 63 2f 6b 65 72 6e 65 6c 2f 64 72 69 neric/kernel/dri
. 0050: 76 65 72 73 2f 61 74 61 2f 6c 69 62 61 68 63 69 vers/ata/libahci
. 0060: 2e 6b 6f 00 00 00 00 00 00 00 00 00 00 00 00 00 .ko.............
. 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0080: 00 00 00 00 00 00 00 00 ........
0x11590 [0x30]: PERF_RECORD_AUXTRACE size: 0x2002a0 offset: 0 ref: 0x2d44e6441a3c2 idx: 0 tid: -1 cpu: 0
.
. ... Intel Processor Trace data: size 2097824 bytes
. 00000000: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
. 00000010: 00 00 00 PAD
. 00000013: 99 20 MODE.TSX TXAbort:0 InTX:0
. 00000015: 99 01 MODE.Exec 64
. 00000017: 7d 08 45 06 81 ff ff 00 FUP 0xffff81064508
. 0000001f: 00 00 00 00 00 00 00 PAD
. 00000026: 02 43 00 76 49 1f 00 00 PIP 0xfa4bb00 (NR=0)
. 0000002e: 00 00 00 00 00 00 00 00 PAD
. 00000036: 02 c8 c2 3a 7c 00 00 00 VMCS 0x7c3ac2
0 0 0x290 [0x88]: PERF_RECORD_MMAP -1/0: [0xffffffffc0000000(0x9000) @ 0]: x /lib/modules/4.4.0-83-generic/kernel/drivers/ata/libahci.ko
0x318 [0x98]: event: 1
.
. ... raw event: size 152 bytes
. 0000: 01 00 00 00 01 00 98 00 ff ff ff ff 00 00 00 00 ................
. 0010: 00 90 00 c0 ff ff ff ff 00 50 00 00 00 00 00 00 .........P......
. 0020: 00 00 00 00 00 00 00 00 2f 6c 69 62 2f 6d 6f 64 ......../lib/mod
. 0030: 75 6c 65 73 2f 34 2e 34 2e 30 2d 38 33 2d 67 65 ules/4.4.0-83-ge
. 0040: 6e 65 72 69 63 2f 6b 65 72 6e 65 6c 2f 64 72 69 neric/kernel/dri
. 0050: 76 65 72 73 2f 76 69 64 65 6f 2f 66 62 64 65 76 vers/video/fbdev
. 0060: 2f 63 6f 72 65 2f 66 62 5f 73 79 73 5f 66 6f 70 /core/fb_sys_fop
. 0070: 73 2e 6b 6f 00 00 00 00 00 00 00 00 00 00 00 00 s.ko............
. 0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0090: 00 00 00 00 00 00 00 00 ........
0x11590 [0x30]: PERF_RECORD_AUXTRACE size: 0x2002a0 offset: 0 ref: 0x2d44e6441a3c2 idx: 0 tid: -1 cpu: 0
.
. ... Intel Processor Trace data: size 2097824 bytes
. 00000000: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
. 00000010: 00 00 00 PAD
. 00000013: 99 20 MODE.TSX TXAbort:0 InTX:0
. 00000015: 99 01 MODE.Exec 64
. 00000017: 7d 08 45 06 81 ff ff 00 FUP 0xffff81064508
. 0000001f: 00 00 00 00 00 00 00 PAD
. 00000026: 02 43 00 76 49 1f 00 00 PIP 0xfa4bb00 (NR=0)
. 0000002e: 00 00 00 00 00 00 00 00 PAD
. 00000036: 02 c8 c2 3a 7c 00 00 00 VMCS 0x7c3ac2
我只需要包含PERF_RECORD_AUXTRACE 的记录下的数据,就像这样。如果第一行包含
英特尔处理器跟踪数据:大小 2097824 字节
也可以从我的输出中避免。
. 00000000: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
. 00000010: 00 00 00 PAD
. 00000013: 99 20 MODE.TSX TXAbort:0 InTX:0
. 00000015: 99 01 MODE.Exec 64
. 00000017: 7d 08 45 06 81 ff ff 00 FUP 0xffff81064508
. 0000001f: 00 00 00 00 00 00 00 PAD
. 00000026: 02 43 00 76 49 1f 00 00 PIP 0xfa4bb00 (NR=0)
. 0000002e: 00 00 00 00 00 00 00 00 PAD
. 00000000: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB
. 00000010: 00 00 00 PAD
. 00000013: 99 20 MODE.TSX TXAbort:0 InTX:0
. 00000015: 99 01 MODE.Exec 64
. 00000017: 7d 08 45 06 81 ff ff 00 FUP 0xffff81064508
. 0000001f: 00 00 00 00 00 00 00 PAD
. 00000026: 02 43 00 76 49 1f 00 00 PIP 0xfa4bb00 (NR=0)
. 0000002e: 00 00 00 00 00 00 00 00 PAD
编辑#3:这是我最初尝试做的......但显然不起作用!
cat "$file" | gawk -F' ' -- '
/PERF_RECORD_AUXTRACE / {
offset = strtonum($1)
hsize = strtonum(substr($2, 2))
size = strtonum($5)
idx = strtonum($11)
ext = ""
ofile = sprintf("raw-pt.txt")
begin = offset + hsize
cmd = sprintf("dd if=%s of=%s conv=notrunc oflag=append ibs=1 " \
"count=%d status=none", file, ofile, size)
#!cmd = sprintf("sed p")
if (dry_run != 0) {
print cmd
}
else {
system(cmd)
}
}
我不太确定如何正确解析此文件以准确获得我想要的。我也不确定使用 Python 是否有帮助。
如何解决这个问题?
【问题讨论】:
-
仍然模棱两可,“包含 PERF_RECORD_AUXTRACE 的记录后的所有行”...直到文件结尾或直到下一个不同部分(如 PERF_RECORD_MMAP)的开始?请在你的 Q 正文中澄清,而不是在 cmets 中。如果 Q 是明确的,我会删除它。祝你好运。
-
嗨@shellter,我已经为我的问题添加了更多细节。我希望你能明白。
-
我已投票结束此问题,因为它似乎是请求推荐工具或解决方案,而不是请求帮助您自己的代码。这使您的问题与 StackOverflow 无关。如果该评估不正确,并且您确实需要帮助编写自己的代码,那么请add your work so far to your question,我将撤回我的近距离投票。您已将您的问题标记为
bash和awk。我希望在您的问题中看到 bash 和 awk 代码。 -
嗨@ghoti,我在我的问题中添加了我的“不工作”awk 代码..
标签: bash shell file parsing awk