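【Posted】:2019-11-14 11:43:51
【Problem description】:
With the Python 3 script below, I can parse the XML records and convert them to a list (by extracting the value attribute of each field).
Please help me improve it so that it prints each field as name:value, using the name attribute from the XML records.
For example, given the block below: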
<field name="RecordType" value="RESGJG"/>
<field name="RecordTypeHEC" value="PY"/>
I get the output:
RESGJG, PY
Required output:
RecordType:RESGJG, RecordTypeHEC:PY
My input file: dummy.xml (## note that it contains two records ## each record starts with record source="AJS/SHD")
<?xml version="1.0" encoding="UTF-8"?>
<records>
<record source="AJS/SHD" type="call">
<group name="General">
<field name="RecordType" value="RESGJG"/>
<field name="RecordTypeHEC" value="PY"/>
<field name="NodeID" value="rock.dsjjgds.cm"/>
<field name="SequenceNumber" value="7937973"/>
<field name="StartDate" value="20171049979"/>
<field name="EndDate" value="201704059739793"/>
<field name="CallDuration" value="973979i"/>
<field name="CauseForRecordClosing" value="normal"/>
</group>
<group name="SIP">
<field name="ICID" value="dshhkdhs"/>
<field name="CallID" value="sdidydakyd2133@10.10.10.1"/>
<field name="User-Agent" value="NotPresent"/>
<field name="Request-URI" value="sip:+47668384"/>
<field name="CalledPartyNumber" value="sip:+08779379972"/>
<field name="CallingPartyNumber" value="sip:+07073873772@10.0.0.1"/>
<field name="To" value="sip:+878379739"/>
<field name="From" value="sip:+937973962"/>
</group>
<group name="VPN">
<field name="VPN_NAME_B" value="blshahd"/>
<field name="VPN_Group_B" value="ctr"/>
<field name="B_ExtType" value="part"/>
<field name="B_ISDN" value="7973"/>
<field name="B_SIP" value="67367672"/>
<field name="B_PABXID" value="797397"/>
</group>
</record>
<record source="AJS/SHD" type="call">
<group name="General">
<field name="RecordType" value="MESGJG"/>
<field name="RecordTypeHEC" value="DY"/>
<field name="NodeID" value="rock.dsjjgds.cm"/>
<field name="SequenceNumber" value="7937973"/>
<field name="StartDate" value="20171049979"/>
<field name="EndDate" value="201704059739793"/>
<field name="CallDuration" value="973979i"/>
<field name="CauseForRecordClosing" value="normal"/>
</group>
<group name="SIP">
<field name="ICID" value="dshhkdhs"/>
<field name="CallID" value="sdidydakyd2133@10.10.10.1"/>
<field name="User-Agent" value="NotPresent"/>
<field name="Request-URI" value="sip:+47668384"/>
<field name="CalledPartyNumber" value="sip:+08779379972"/>
<field name="CallingPartyNumber" value="sip:+07073873772@10.0.0.1"/>
<field name="To" value="sip:+878379739"/>
<field name="From" value="sip:+937973962"/>
</group>
<group name="VPN">
<field name="VPN_NAME_B" value="blshahd"/>
<field name="VPN_Group_B" value="ctr"/>
<field name="B_ExtType" value="part"/>
<field name="B_ISDN" value="7973"/>
<field name="B_SIP" value="67367672"/>
<field name="B_PABXID" value="797397"/>
</group>
</record>
</records>
I have tried the script below, which parses the XML fields and prints them as a flat list.
import sys
import operator
from functools import reduce
from xml.etree.ElementTree import ElementTree
tree = ElementTree()
tree.parse("dummy.xml")
root = tree.getroot()
data = []
groups = root.findall('.//group')
for group in groups:
    data.append([f.attrib['value'] for f in group.findall('./field')])
q = reduce(operator.concat, data)
s = ", ".join(q)
print(s)
The output is:
RESGJG, PY, rock.dsjjgds.cm, 7937973, 20171049979, 201704059739793, 973979i, normal, dshhkdhs, sdidydakyd2133@10.10.10.1, NotPresent, sip:+47668384, sip:+08779379972, sip:+07073873772@10.0.0.1, sip:+878379739, sip:+937973962, blshahd, ctr, part, 7973, 67367672, 797397, MESGJG, DY, rock.dsjjgds.cm, 7937973, 20171049979, 201704059739793, 973979i, normal, dshhkdhs, sdidydakyd2133@10.10.10.1, NotPresent, sip:+47668384, sip:+08779379972, sip:+07073873772@10.0.0.1, sip:+878379739, sip:+937973962, blshahd, ctr, part, 7973, 67367672, 797397
Required output (one line per record):
RecordType:RESGJG, RecordTypeHEC:PY, NodeID:rock.dsjjgds.cm, SequenceNumber:7937973, StartDate:20171049979, EndDate:201704059739793, CallDuration:973979i, CauseForRecordClosing:normal, ICID:dshhkdhs, CallID:sdidydakyd2133@10.10.10.1, User-Agent:NotPresent, Request-URI:sip:+47668384, CalledPartyNumber:sip:+08779379972, CallingPartyNumber:sip:+07073873772@10.0.0.1, To:sip:+878379739, From:sip:+937973962, VPN_NAME_B:blshahd, VPN_Group_B:ctr, B_ExtType:part, B_ISDN:7973, B_SIP:67367672, B_PABXID:797397,
RecordType:MESGJG, RecordTypeHEC:DY, NodeID:rock.dsjjgds.cm, SequenceNumber:7937973, StartDate:20171049979, EndDate:201704059739793, CallDuration:973979i, CauseForRecordClosing:normal, ICID:dshhkdhs, CallID:sdidydakyd2133@10.10.10.1, User-Agent:NotPresent, Request-URI:sip:+47668384, CalledPartyNumber:sip:+08779379972, CallingPartyNumber:sip:+07073873772@10.0.0.1, To:sip:+878379739, From:sip:+937973962, VPN_NAME_B:blshahd, VPN_Group_B:ctr, B_ExtType:part, B_ISDN:7973, B_SIP:67367672, B_PABXID:797397,
Any help would be appreciated.
【Discussion】:
- You are only reading f.attrib['value']. You also need to read f.attrib['name'], and make data a dictionary, since a dictionary is what you want.
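The comment's suggestion can be sketched roughly as follows. This is only a minimal illustration, not a definitive solution: the helper names record_to_dict and format_records, and the shortened SAMPLE string standing in for dummy.xml, are assumptions made for this example. It builds one dictionary per record and joins the pairs as name:value, so each record ends up on its own line. It assumes field names are unique within a record (a repeated name would overwrite the earlier value) and relies on dictionaries keeping insertion order (Python 3.7+).

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for dummy.xml (assumption: same record/group/field layout).
SAMPLE = """<records>
<record source="AJS/SHD" type="call">
<group name="General">
<field name="RecordType" value="RESGJG"/>
<field name="RecordTypeHEC" value="PY"/>
</group>
<group name="SIP">
<field name="ICID" value="dshhkdhs"/>
</group>
</record>
<record source="AJS/SHD" type="call">
<group name="General">
<field name="RecordType" value="MESGJG"/>
<field name="RecordTypeHEC" value="DY"/>
</group>
<group name="SIP">
<field name="ICID" value="dshhkdhs"/>
</group>
</record>
</records>"""


def record_to_dict(record):
    """Map each field's name attribute to its value attribute for one <record>."""
    # Hypothetical helper: walks every <field> under every <group> of the record.
    return {f.attrib["name"]: f.attrib["value"]
            for f in record.findall("./group/field")}


def format_records(root):
    """Return one 'name:value, name:value, ...' string per <record>."""
    lines = []
    for record in root.findall("./record"):
        pairs = record_to_dict(record)
        lines.append(", ".join(f"{name}:{value}" for name, value in pairs.items()))
    return lines


root = ET.fromstring(SAMPLE)
for line in format_records(root):
    print(line)
```

To run it against the real file, root = ET.parse("dummy.xml").getroot() can replace the ET.fromstring(SAMPLE) call. Unlike the required output shown in the question, the joined lines do not end with a trailing comma.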
Tags: python regex python-3.x xml