【Question title】: Convert complicated XML to CSV
【Posted】: 2020-07-14 14:55:32
【Question description】:

My XML is structured as follows:

<result>
    <report>
        <id>111</id>
        <user>username1</user>
        <actions_list>
            <action1>
                <id>a_1</id>
            </action1>
            <action1>
                <id>a_2</id>
            </action1>
            <action1>
                <id>a_3</id>
            </action1>
        </actions_list>
    </report>

    <report>
        <id>222</id>
        <user>username2</user>
        <actions_list>
            <action1>
                <id>a_1</id>
            </action1>
            <action2>
                <id>a_2</id>
            </action2>
            <action3>
                <id>a_3</id>
            </action3>
            <action4>
                <id>a_4</id>
            </action4>
            <action5>
                <id>a_5</id>
            </action5>
        </actions_list>
    </report>
</result>

So I want to create a CSV file with the following structure:

+---+-----+-----------+-----+
| 1 | 111 | username1 | a_1 |
+---+-----+-----------+-----+
| 1 | 111 | username1 | a_2 |
+---+-----+-----------+-----+
| 1 | 111 | username1 | a_3 |
+---+-----+-----------+-----+
| 2 | 222 | username2 | a_1 |
+---+-----+-----------+-----+
| 2 | 222 | username2 | a_2 |
+---+-----+-----------+-----+
| 2 | 222 | username2 | a_3 |
+---+-----+-----------+-----+
| 2 | 222 | username2 | a_4 |
+---+-----+-----------+-----+
| 2 | 222 | username2 | a_5 |
+---+-----+-----------+-----+

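I tried using Python's BeautifulSoup and xml.etree, but I couldn't handle fields that share the same name ("id" in my example) or the number of actions varying from one report to the next. How can I do this? Any help would be greatly appreciated. Thanks in advance.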

【Question comments】:

    Tags: python xml parsing


    【Solution 1】:

    Try the following:

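    import xml.etree.ElementTree as ET
    import csv
    
    # Open the output file for writing; newline="" lets the csv module
    # control line endings itself
    with open("yourfile.csv", "w", newline="") as f:
        writer = csv.writer(f)
    
        # Use only one of the next two lines, depending on whether the XML
        # is in a string or in a file
        root = ET.fromstring(your_xml_str)
        # root = ET.parse("user_actions.xml").getroot()
        
        for reportIndex, report in enumerate(root, start=1):
            # find() only searches direct children, so this is the report's
            # own <id>, not the <id> of one of its actions
            report_id = report.find("id").text
            user = report.find("user").text
            actions_list = report.find("actions_list")
            
            # Iterating over actions_list yields every child element,
            # whatever its tag name (action1, action2, ...)
            for action in actions_list:
                action_id = action.find("id").text
                writer.writerow([reportIndex, report_id, user, action_id])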

    【Comments】:

    • Thanks, it worked for me! But I made one small change: root = parse("user_actions.xml").getroot()
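
The change in the comment above matters because ET.parse returns an ElementTree object rather than the root element, so getroot() is needed before looping over the reports. Below is a minimal, self-contained sketch of the same approach that can be run without any input file. SAMPLE_XML is a shortened copy of the XML from the question, and the names report_rows and xml_to_csv are hypothetical helpers introduced here for illustration only. It also assumes that a report without an actions_list should simply be skipped, which the original thread does not specify.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question (illustrative data only)
SAMPLE_XML = """
<result>
    <report>
        <id>111</id>
        <user>username1</user>
        <actions_list>
            <action1><id>a_1</id></action1>
            <action1><id>a_2</id></action1>
        </actions_list>
    </report>
    <report>
        <id>222</id>
        <user>username2</user>
        <actions_list>
            <action1><id>a_1</id></action1>
            <action2><id>a_2</id></action2>
            <action3><id>a_3</id></action3>
        </actions_list>
    </report>
</result>
""".strip()


def report_rows(root):
    """Yield one [report index, report id, user, action id] row per action.

    Hypothetical helper, not part of the original answer.
    """
    for index, report in enumerate(root.findall("report"), start=1):
        # findtext() only looks at direct children, so the nested
        # action <id> elements are not picked up here
        report_id = report.findtext("id")
        user = report.findtext("user")
        actions = report.find("actions_list")
        if actions is None:
            # Assumption: a report with no actions_list produces no rows
            continue
        # Iterating an element yields all of its children, whatever their tag
        for action in actions:
            yield [index, report_id, user, action.findtext("id")]


def xml_to_csv(root, out):
    """Write every row to an already opened text file or buffer."""
    csv.writer(out).writerows(report_rows(root))


root = ET.fromstring(SAMPLE_XML)
# When reading from a file instead:
# root = ET.parse("user_actions.xml").getroot()

buffer = io.StringIO()
xml_to_csv(root, buffer)
print(buffer.getvalue())
```

To write to disk instead of a buffer, pass a file opened with open("yourfile.csv", "w", newline="") as the second argument to xml_to_csv.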