【Question title】: Parse XML to file with array
【Posted】: 2019-02-21 12:34:19
【Question description】:

I need to generate the file shown below by parsing the sample XML with Python:

Sample XML

<fruits>
<tag beginTime="20181125020000" endTime="20181202020000">
<EventId>16778</EventId>
    <item color="red">
        <name>apple</name>
        <count>1</count>
        <subtag>
            <Info name="Eid">396</Info>
            <Info name="New">397</Info>
        </subtag>
    </item>
    <item color="yellow">
        <name>banana</name>
        <count>2</count>
        <subtag>
            <Info name="Eid">500</Info>
            <Info name="New">650</Info>
            <Info name="Col">999</Info>
        </subtag>
    </item>
</tag>
</fruits>

Expected output:

20181125020000|20181202020000|16778|red|apple|1|Eid;396;New;397|
20181125020000|20181202020000|16778|yellow|banana|2|Eid;500;New;650;Col;999|

【Comments】:

Tags: python xml parsing elementtree minidom


【Solution 1】:

Another approach is to convert the XML into a dictionary (a JSON-like structure) with xmltodict:

import xmltodict

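# Parse the file and drill down to the <tag> element
with open('file.xml') as f:
    d = xmltodict.parse(f.read())['fruits']['tag']

# Attributes are exposed as '@name' keys, element text as '#text'
for i in d['item']:
    # Flatten each <Info name="...">value</Info> into "name;value"
    subtag = []
    for s in i['subtag']['Info']:
        subtag.append('{};{}'.format(s['@name'], s['#text']))
    print('{}|{}|{}|{}|{}|{}|{}|'.format(d['@beginTime'], d['@endTime'], d['EventId'], i['@color'], i['name'], i['count'], ';'.join(subtag)))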

Output:

20181125020000|20181202020000|16778|red|apple|1|Eid;396;New;397|
20181125020000|20181202020000|16778|yellow|banana|2|Eid;500;New;650;Col;999|
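
One caveat with the dictionary approach: as far as I know, xmltodict returns a single dict rather than a list when an element occurs only once, so the loops above would likely misbehave for a `<tag>` with only one `<item>`, or a `<subtag>` with only one `<Info>`. Below is a minimal sketch of a guard. The helpers `as_list` and `format_item` are hypothetical names (not part of xmltodict), and the sample dict is hand-written to mimic the parsed structure so the sketch runs without the library.

```python
def as_list(value):
    """Wrap a lone value in a list so it can be iterated uniformly."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def format_item(tag, item):
    """Build one pipe-delimited line from dicts shaped like xmltodict output."""
    infos = as_list(item['subtag']['Info'])
    subtag = ';'.join('{};{}'.format(s['@name'], s['#text']) for s in infos)
    return '{}|{}|{}|{}|{}|{}|{}|'.format(
        tag['@beginTime'], tag['@endTime'], tag['EventId'],
        item['@color'], item['name'], item['count'], subtag)


# Hand-written stand-in (an assumption) for a <tag> holding a single <item>
# whose <subtag> holds a single <Info>
sample_tag = {
    '@beginTime': '20181125020000',
    '@endTime': '20181202020000',
    'EventId': '16778',
    'item': {
        '@color': 'red', 'name': 'apple', 'count': '1',
        'subtag': {'Info': {'@name': 'Eid', '#text': '396'}},
    },
}

lines = [format_item(sample_tag, i) for i in as_list(sample_tag['item'])]
for line in lines:
    print(line)
```

If I remember correctly, `xmltodict.parse` also accepts a `force_list` argument (for example `force_list=('item', 'Info')`) that should give the same effect at parse time; check the library documentation before relying on it.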

【Comments】:

    【Solution 2】:

    Try this code.

    import xml.etree.ElementTree as Et

    file = Et.parse('some.xml')

    tags = file.findall('tag')
    for tag in tags:
        # Fields shared by every <item> under this <tag>
        temp1 = []
        beginTime = tag.get('beginTime')
        temp1.append(beginTime)
        endTime = tag.get('endTime')
        temp1.append(endTime)
        eventId = tag.find('EventId').text
        temp1.append(eventId)
        items = tag.findall('item')

        for item in items:
            # Fields specific to this <item>
            temp2 = []
            color = item.get('color')
            temp2.append(color)
            name = item.find('name').text
            temp2.append(name)
            count = item.find('count').text
            temp2.append(count)
            infos = item.find('subtag').findall('Info')

            # Flatten the <Info> name/value pairs into "name;value;name;value"
            temp3 = []
            for info in infos:
                name = info.get('name')
                value = info.text
                temp3.append(name)
                temp3.append(value)
            temp3 = [';'.join(temp3)]
            result = temp1 + temp2 + temp3
            # Each line of the expected output ends with a trailing '|'
            result = '|'.join(result) + '|'
            print(result)
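
Since the question asks for a file rather than console output, the same ElementTree traversal can be wrapped in a function that returns the lines, which are then written out. This is only a sketch under stated assumptions: the function name `xml_to_lines`, the inline `SAMPLE` string (the question's XML with the closing `</fruits>` added), and the output location in the system temp directory are all illustrative choices, not something the original answers specify.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# The question's sample XML, inlined so the sketch is self-contained
SAMPLE = """<fruits>
<tag beginTime="20181125020000" endTime="20181202020000">
<EventId>16778</EventId>
    <item color="red">
        <name>apple</name>
        <count>1</count>
        <subtag>
            <Info name="Eid">396</Info>
            <Info name="New">397</Info>
        </subtag>
    </item>
    <item color="yellow">
        <name>banana</name>
        <count>2</count>
        <subtag>
            <Info name="Eid">500</Info>
            <Info name="New">650</Info>
            <Info name="Col">999</Info>
        </subtag>
    </item>
</tag>
</fruits>"""


def xml_to_lines(xml_text):
    """Return one pipe-delimited line per <item> element."""
    root = ET.fromstring(xml_text)
    lines = []
    for tag in root.findall('tag'):
        # Values shared by every item under this tag
        head = [tag.get('beginTime'), tag.get('endTime'), tag.findtext('EventId')]
        for item in tag.findall('item'):
            body = [item.get('color'), item.findtext('name'), item.findtext('count')]
            # Collect alternating name/value entries from the <Info> children
            pairs = []
            for info in item.findall('subtag/Info'):
                pairs += [info.get('name'), info.text]
            lines.append('|'.join(head + body + [';'.join(pairs)]) + '|')
    return lines


lines = xml_to_lines(SAMPLE)

# Hypothetical output path; replace with the real destination file
out_path = os.path.join(tempfile.gettempdir(), 'output.txt')
with open(out_path, 'w') as out:
    out.write('\n'.join(lines) + '\n')
```

Returning a list of strings keeps the parsing separate from the I/O, so the same function can feed `print`, a file, or a test.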
    

    【Comments】:
