【Question title】:XML to Python dictionary
【Posted】:2021-12-29 07:07:11
【Question description】:

I have an XML file whose content is tree-structured, as shown below:

<?xml version="1.0" encoding="UTF-8"?>
<fee_config>
    <fees member_group="00400F" mail_retail="MAIL">
        <admin_fee>0.76</admin_fee>
        <processing_fee>1.83</processing_fee>
    </fees>
    <fees member_group="00400F" mail_retail="RETAIL">
        <admin_fee>1.335</admin_fee>
        <processing_fee>1.645</processing_fee>
    </fees>
    <fees member_group="00460G" mail_retail="MAIL">
        <admin_fee>0.88</admin_fee>
        <processing_fee>1.18</processing_fee>
    </fees>
</fee_config>

What are the ways to convert it into a simple dictionary in Python?

【Comments】:

Tags: python xml


【Solution 1】:

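There is no unique mapping from XML to a Python dictionary. One is a node tree and the other is a hash map, so it is an apples-to-oranges comparison, and you have to make the design decisions yourself based on what you want to get out of it.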
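
To make the ambiguity concrete, here is a small sketch of two equally defensible mappings for a single element. The names `choice_a` and `choice_b` are made up for illustration, and the element is a trimmed-down version of one `fees` entry from the question.

```python
from xml.etree import ElementTree as ET

# A cut-down <fees> element, based on the question's sample data
elem = ET.fromstring(
    '<fees member_group="00400F"><admin_fee>0.76</admin_fee></fees>'
)

# Choice A: keep attributes under a dedicated key, apart from child elements
choice_a = {
    '@attribs': dict(elem.attrib),
    'admin_fee': elem.find('admin_fee').text,
}

# Choice B: merge attributes and child elements into one flat dict
# (simpler to read, but an attribute and a child with the same name would collide)
choice_b = {**elem.attrib, 'admin_fee': elem.find('admin_fee').text}

print(choice_a)
print(choice_b)
```

Neither result is more correct than the other; which one fits depends on how the dictionary will be consumed.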

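Sreehari's link has a solution that nicely converts lxml nodes into a Python dict, but:

  • It requires lxml, which is fine, but I prefer standard-library modules when they can do the job
  • It does not capture attributes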

I took that code and converted it to work with Python's standard xml.etree.ElementTree module, and it handles attributes in its own way.

When I run this code on your sample, I get the following dict:

{'fees': [{'@attribs': {'mail_retail': 'MAIL', 'member_group': '00400F'},
           'admin_fee': '0.76',
           'processing_fee': '1.83'},
          {'@attribs': {'mail_retail': 'RETAIL', 'member_group': '00400F'},
           'admin_fee': '1.335',
           'processing_fee': '1.645'},
          {'@attribs': {'mail_retail': 'MAIL', 'member_group': '00460G'},
           'admin_fee': '0.88',
           'processing_fee': '1.18'}]}

Note the @attribs key; that is how I decided attributes should be stored. If you need something different, you can modify the code to your liking:

#!/usr/bin/env python3
from xml.etree import ElementTree as ET
from pprint import pprint


def elem2dict(node):
    """
    Convert an xml.etree.ElementTree node tree into a dict.
    """
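    result = {}

    # Store this node's own attributes once, even if it has no child elements
    if node.attrib:
        result['@attribs'] = dict(node.attrib)

    for element in node:
        key = element.tag
        if '}' in key:
            # Remove the namespace prefix, e.g. '{uri}tag' -> 'tag'
            key = key.split('}', 1)[1]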

        # Treat the element as a text leaf if it holds non-whitespace text; otherwise recurse into its children
        if element.text and element.text.strip():
            value = element.text
        else:
            value = elem2dict(element)

        # Check if a node with this name at this depth was already found
        if key in result:
            if not isinstance(result[key], list):
                # Seen exactly once before: wrap the earlier value and the new one in a list.
                # The earlier value may be a str, which has no .copy(), so reuse it directly.
                result[key] = [result[key], value]
            else:
                # Already a list: just append the new value
                result[key].append(value)
        else:
            # First time we've seen it
            result[key] = value

    return result


root = ET.parse('input.xml').getroot()
pprint(elem2dict(root))
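
If the goal is a flat lookup rather than a generic conversion, one possible alternative is to index each `fees` element by its attributes. The sketch below is only an illustration under a few assumptions: the function name `fees_lookup` is made up, the sample is the question's data with the root element closed and the XML declaration omitted, and every child of `fees` is assumed to hold a number that can be parsed as a float.

```python
from xml.etree import ElementTree as ET

# Sample data from the question (root element closed, XML declaration omitted)
SAMPLE = """
<fee_config>
    <fees member_group="00400F" mail_retail="MAIL">
        <admin_fee>0.76</admin_fee>
        <processing_fee>1.83</processing_fee>
    </fees>
    <fees member_group="00400F" mail_retail="RETAIL">
        <admin_fee>1.335</admin_fee>
        <processing_fee>1.645</processing_fee>
    </fees>
    <fees member_group="00460G" mail_retail="MAIL">
        <admin_fee>0.88</admin_fee>
        <processing_fee>1.18</processing_fee>
    </fees>
</fee_config>
"""


def fees_lookup(root):
    """Map (member_group, mail_retail) -> {child tag: float value}."""
    lookup = {}
    for fees in root.iter('fees'):
        # The two attributes together identify one fee entry
        key = (fees.get('member_group'), fees.get('mail_retail'))
        # Assumes each child element contains a numeric string
        lookup[key] = {child.tag: float(child.text) for child in fees}
    return lookup


root = ET.fromstring(SAMPLE)
lookup = fees_lookup(root)
print(lookup[('00400F', 'RETAIL')]['admin_fee'])  # prints 1.335
```

This trades generality for convenience: it only understands this particular layout, but a fee can be read with a single key lookup.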

【Comments】:
