【Posted】: 2019-07-24 18:07:55
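【Question】:
I have a large text file, and I need to extract certain blocks of data based on specific conditions that appear in the lines following the start of each block. How can I find and extract these blocks with Python regex?
A sample file (source.txt) looks like this.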
.
.
.
Request: 22:11:22
Discription1: From the Client 1
Discription2: requesting HTTP
Version: 1.2
Type: browsing
Data: AAAA CFFFF FFF
Answer: 33:22:44
Discription1: From Server B
Discription2: Respons HHTP
Version: 1.1
Type: browsing
Data: kCmkc9AS 9as9 as99 as76d 8aS9d8 6ASDQWv sf
Request: 31:24:53:33
Discription1: From Client 2
Discription2: requesting HTTP
Version: 1.1
Type: DASH
Data: AAAA CFFFF FFF
Answer: 41:24:33:33
Discription1: From Server A
Discription2: Response
Version: 1.1
Type: DASH
Data:ask sef k5q3 WEB 54 fkl n5 qwe@#%@#SDG adkjwra;k4 kfk
Request: 61:44:23:33
Discription1: From Client 2
Discription2: requesting HTTP
Version: 1.1
Type: DASH
Data: AAAA CFFFF FFF
Data Discription: From the Cleint VM2
Answer: 71:25:33:33
Discription1: From Server A
Discription2: Response
Version: 1.1
Type: DASH
Data:ask sef k5q3 WEB 54 fkl n5 qwe@#%@#SDG adkjwra;k4 kfk
.
.
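I need to get the blocks that start with "Request:" and have both of these features: "Version: 1.1" and "Client 2".
Important notes:
The blocks differ in length, so they do not all carry the same information, but they share the same matching features.
There are many spaces and line breaks between them.
The matching features may not appear on a specific line or in a specific order.
I need to capture each block up to the next "Answer" keyword.
Expected output: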
Request: 31:24:53:33
Discription1: From Client 2
Discription2: requesting HTTP
Version: 1.1
Type: DASH
Data: AAAA CFFFF FFF
Request: 61:44:23:33
Discription1: From Client 2
Discription2: requesting HTTP
Version: 1.1
Type: DASH
Data: AAAA CFFFF FFF
Data Discription: From the Cleint VM2
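One way the requirements above could be approached is in two steps: first cut the text into blocks that run from a "Request:" line up to the next "Answer:" line, then test each block for the two features separately so their order does not matter. The following is only a minimal sketch under that assumption. The function name `extract_blocks` and the three pattern names are hypothetical, and the patterns assume each keyword starts its own line (optionally indented).

```python
import re

# Hypothetical patterns, written for the sample layout shown in the question.
# A block starts at a line beginning with "Request:" and runs lazily up to
# (but not including) the next line beginning with "Answer:", or end of text.
BLOCK_RE = re.compile(
    r"^[ \t]*Request:.*?(?=^[ \t]*Answer:|\Z)",
    re.MULTILINE | re.DOTALL,
)
# The two features are checked independently, so their order is irrelevant.
VERSION_RE = re.compile(r"^[ \t]*Version:\s*1\.1\s*$", re.MULTILINE)
CLIENT_RE = re.compile(r"\bClient\s+2\b")


def extract_blocks(text):
    """Return Request blocks that contain both 'Version: 1.1' and 'Client 2'."""
    blocks = []
    for match in BLOCK_RE.finditer(text):
        block = match.group(0)
        if VERSION_RE.search(block) and CLIENT_RE.search(block):
            # Drop the surrounding whitespace and blank lines around the block.
            blocks.append(block.strip())
    return blocks
```

With the sample file, this could be called as `extract_blocks(open("source.txt").read())` and the returned blocks printed one after another. Splitting first and filtering afterwards keeps each pattern short; a single pattern with lookaheads is also possible but tends to be harder to read and to adjust when more features are added.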
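【Comments】:
-
You need to be clearer than this. Give an example.
-
This question has now been deleted (users with 10k+ reputation can still see it, but they cannot comment on or answer it).
-
I have updated it.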
Tags: python regex python-3.x regex-lookarounds regex-group