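【Posted】: 2017-09-06 17:08:27
【Question】:
I have a pcap file captured with tshark that contains data I want to analyze. I would like to analyze it and export the result to a CSV or xls file. According to the tshark documentation, I can either use the -z option with the appropriate arguments, or use -T together with -E and -e. I am using Python 3.6 on a Debian machine. Currently, my command looks like this: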
command="tshark -q -o tcp.relative_sequence_numbers:false -o tcp.analyze_sequence_numbers:false " \
"-o tcp.track_bytes_in_flight:false -Q -l -z diameter,avp,272,Session-Id,Origin-Host," \
"Origin-Realm,Destination-Realm,Auth-Application-Id,Service-Context-Id,CC-Request-Type,CC-Request-Number," \
"Subscription-Id,CC-Session-Failover,Destination-Host,User-Name,Origin-State-Id," \
"Multiple-Services-Credit-Control,Requested-Service-Unit,Used-Service-Unit,SN-Total-Used-Service-Unit," \
"SN-Remaining-Service-Unit,Service-Identifier,Rating-Group,User-Equipment-Info,Service-Information," \
"Route-Record,Credit-Control-Failure-Handling -r {}".format(args.input_file)
Later I process the output with a pandas DataFrame, as follows:
# imports used by this part of the script
import re
import subprocess
from collections import OrderedDict

import pandas as pd

data_list = []  # one ordered dictionary per line of tshark output

# loops adding TCP and/or UDP ports to scan traffic from
if args.tcp:
    for port in args.tcp:
        command += " -d tcp.port=={},diameter".format(port)
if args.udp:
    for port in args.udp:
        command += " -d udp.port=={},diameter".format(port)
# calling subprocess with output redirection to task variable
task = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
# a loop adding new data dictionaries to data_list
for line in task.stdout:
    # firstly, decode byte string and get rid of '
    line = re.sub(r"'", "", line.decode("utf-8"))
    # secondly, split string on every whitespace or = and obtain a dictionary-like list of keys, values
    line = re.split(r"\s|=", line)
    # convert obtained list to ordered dictionary to preserve column order:
    # each item i becomes a key and item i+1 becomes its value
    row = OrderedDict(line[i:i+2] for i in range(0, len(line)-2, 2))
    data_list.append(row)
# remove last 4 dictionaries (last 4 lines of task.stdout)
data_list = data_list[:-4]
# create data frame from list of dicts and fill each NaN with "-"
df = pd.DataFrame(data_list).fillna("-")
df.to_excel("{}.xls".format(args.output_file), index=False)
print("Please remember that 'frame' column may not correspond to row index!")
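When I open the output file, I can see that it works fine, except that for some fields, e.g. CC-Request-Number, I get the numeric value instead of its string representation. For example, in Wireshark I have data like this:
In the CC-Request-Number column of the output Excel file, the row corresponding to this packet shows 3 instead of TERMINATION-REQUEST.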
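My question is: how can I convert this number to its string representation when using the -z option, or (as I can guess from what I have seen online) how can I get these values using -T and -e? I listed all available fields with tshark -G, but there are too many of them and I cannot think of a reasonable way to find the ones I need.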
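One way to narrow down the `tshark -G` listing is to filter it by protocol and keyword instead of reading it whole. The sketch below is only an illustration: it assumes the tab-separated layout of `tshark -G fields`, in which field records start with `F` followed by the descriptive name, the abbreviation, the type, and the parent protocol, so check it against the output of your tshark version. `find_fields` and `SAMPLE_DUMP` are hypothetical names, and the sample lines stand in for real output.

```python
def find_fields(fields_dump, protocol, keyword):
    """Return (abbrev, name, type) for fields of `protocol` matching `keyword`.

    `fields_dump` is text in the (assumed) `tshark -G fields` layout.
    """
    keyword = keyword.lower()
    matches = []
    for line in fields_dump.splitlines():
        cols = line.split("\t")
        # Field records start with "F"; protocol records ("P") are skipped.
        if len(cols) < 5 or cols[0] != "F":
            continue
        name, abbrev, ftype, parent = cols[1], cols[2], cols[3], cols[4]
        if parent != protocol:
            continue
        # Case-insensitive match on either the name or the abbreviation.
        if keyword in name.lower() or keyword in abbrev.lower():
            matches.append((abbrev, name, ftype))
    return matches


# Hypothetical excerpt standing in for real `tshark -G fields` output.
SAMPLE_DUMP = "\n".join([
    "P\tDiameter Protocol\tdiameter",
    "F\tCC-Request-Type\tdiameter.CC-Request-Type\tFT_UINT32\tdiameter\tBASE_DEC\t0x0\t",
    "F\tSession-Id\tdiameter.Session-Id\tFT_STRING\tdiameter\t\t0x0\t",
    "F\tSource Port\ttcp.srcport\tFT_UINT16\ttcp\tBASE_PT_TCP\t0x0\t",
])

if __name__ == "__main__":
    for abbrev, name, ftype in find_fields(SAMPLE_DUMP, "diameter", "request"):
        print(abbrev, name, ftype)
```

To run it on the real listing, the text can be obtained with something like `subprocess.run(["tshark", "-G", "fields"], stdout=subprocess.PIPE, universal_newlines=True).stdout` and passed in as `fields_dump`.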
【Discussion】:
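For the -T/-E/-e route mentioned in the question, a minimal sketch of building the command as an argument list (so that `shell=True` is not needed) could look like the following. The flags `-T fields`, `-E header=y`, `-E separator=,`, `-E quote=d`, `-e` and `-d` are regular tshark options; the field names in `FIELDS` are assumptions that should be confirmed with `tshark -G fields`, and `build_fields_command` is a hypothetical helper. Whether an enumerated AVP is printed as a name or as a number in this mode has not been verified here.

```python
def build_fields_command(pcap_path, fields, tcp_ports=(), udp_ports=()):
    """Build a tshark argument list that prints `fields` as CSV with a header row."""
    argv = [
        "tshark", "-r", str(pcap_path),
        "-T", "fields",        # print only the fields selected with -e
        "-E", "header=y",      # first output line holds the field names
        "-E", "separator=,",   # comma-separated columns
        "-E", "quote=d",       # wrap every value in double quotes
    ]
    for field in fields:
        argv += ["-e", field]
    # Decode traffic on the given ports as Diameter, as in the original command.
    for port in tcp_ports:
        argv += ["-d", "tcp.port=={},diameter".format(port)]
    for port in udp_ports:
        argv += ["-d", "udp.port=={},diameter".format(port)]
    return argv


# Assumed field names; verify them against `tshark -G fields`.
FIELDS = ["frame.number", "diameter.Session-Id", "diameter.CC-Request-Type"]

if __name__ == "__main__":
    print(build_fields_command("capture.pcap", FIELDS, tcp_ports=[3868]))
```

The resulting list can be passed directly to `subprocess.Popen(argv, stdout=subprocess.PIPE)`, and because the output is already CSV with a header row, it could then be loaded with `pd.read_csv(task.stdout)` instead of being split by hand.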