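There is no one-to-one mapping from XML to a Python dict. One is a node tree and the other is a hash map, so this is an "apples and something else" comparison, and you have to make the design decisions yourself based on what you want the result to look like.
The solution Sreehari linked does a good job of converting an lxml node into a Python dict, but:
- it requires lxml, which is fine, but I prefer standard-library modules when they can do the job
- it doesn't capture attributes
I took that code and converted it to work with Python's standard xml.etree.ElementTree module, and it handles attributes in its own way.
When I run this code against your sample, I get the following dict: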
{'fees': [{'@attribs': {'mail_retail': 'MAIL', 'member_group': '00400F'},
           'admin_fee': '0.76',
           'processing_fee': '1.83'},
          {'@attribs': {'mail_retail': 'RETAIL', 'member_group': '00400F'},
           'admin_fee': '1.335',
           'processing_fee': '1.645'},
          {'@attribs': {'mail_retail': 'MAIL', 'member_group': '00460G'},
           'admin_fee': '0.88',
           'processing_fee': '1.18'}]}
Notice the @attribs key: that is how I decided attributes should be stored. If you need something else, you can modify the code to your liking:
#!/usr/bin/env python3
from xml.etree import ElementTree as ET
from pprint import pprint


def elem2dict(node):
    """
    Convert an xml.etree.ElementTree node tree into a dict.
    """
    result = {}

    # Store this node's own attributes once, before visiting its children
    if node.attrib:
        result['@attribs'] = dict(node.items())

    for element in node:
        key = element.tag
        if '}' in key:
            # Remove namespace prefix
            key = key.split('}')[1]

        # Use the text if the element contains non-whitespace text,
        # otherwise process the element as a subtree
        if element.text and element.text.strip():
            value = element.text
        else:
            value = elem2dict(element)

        # Check if a node with this name at this depth was already found
        if key in result:
            if type(result[key]) is not list:
                # We've seen it before, but only once, we need to convert it to a list
                # (no .copy() here: the earlier value may be a plain string)
                result[key] = [result[key], value]
            else:
                # We've seen it at least once, it's already a list, just append the new value
                result[key].append(value)
        else:
            # First time we've seen it
            result[key] = value

    return result


root = ET.parse('input.xml').getroot()
pprint(elem2dict(root))
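As an illustration of "modify it to your liking", here is one possible variant. It is only a sketch: the function name elem_to_dict, the '#text' key, and the inline sample XML (modeled on the output shown above, not your actual file, with a made-up currency attribute) are my own assumptions. The difference is that an element carrying both text and attributes keeps both, instead of being reduced to its text.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample modeled on the output above; the "currency"
# attribute is invented to show an element with text AND attributes.
SAMPLE = """
<root>
  <fees mail_retail="MAIL" member_group="00400F">
    <admin_fee>0.76</admin_fee>
    <processing_fee>1.83</processing_fee>
  </fees>
  <fees mail_retail="RETAIL" member_group="00400F">
    <admin_fee currency="USD">1.335</admin_fee>
    <processing_fee>1.645</processing_fee>
  </fees>
</root>
"""


def elem_to_dict(node):
    """Variant sketch: keeps attributes on text-only elements too."""
    result = {}
    if node.attrib:
        result['@attribs'] = dict(node.attrib)
    for child in node:
        # Drop a '{namespace}' prefix if the tag has one
        key = child.tag.split('}', 1)[-1]
        text = (child.text or '').strip()
        if len(child) == 0 and not child.attrib:
            # Plain leaf: store just its text ('' if it is empty)
            value = text
        else:
            value = elem_to_dict(child)
            if text:
                # '#text' is an assumed key name for text that sits
                # next to attributes or child elements
                value['#text'] = text
        if key in result:
            # Repeated tag at this depth: collect the values in a list
            if not isinstance(result[key], list):
                result[key] = [result[key]]
            result[key].append(value)
        else:
            result[key] = value
    return result


data = elem_to_dict(ET.fromstring(SAMPLE))
```

With this variant a plain leaf is still a string, while the second admin_fee becomes a small dict with '@attribs' and '#text'. Whether that mixed shape is acceptable is exactly the kind of design decision mentioned at the top.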