【Question Title】:Splitting an XML file into multiple files at given tags using Python
【Posted】:2019-12-26 10:01:42
【Question Description】:

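Hello, I have a large XML file that I want to split into multiple files based on id (each id is unique). My XML file currently contains 3 unique talLine ids, and I want to split it on them.

My file is as follows:

main.xml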

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tal xmlns="http://hello.com" schemaVersion="5.0.0" refSchemaFile="tal.xsd" status="Executable">
    <ExecProperties supportsParallelMostFlash="false"/>
    <talLine id="tl_1" status="Executable" baseVariant="DKOMBI8" diagAddress="60">
        <blFlash status="Executable">
            <blFlashTA status="Executable">
                <sgbmid>
                    <processClass>BTLD</processClass>
                    <id>00007732</id>
                    <mainVersion>2</mainVersion>
                    <subVersion>3</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </blFlashTA>
            <blFlashTA status="Executable">
                <sgbmid>
                    <processClass>FLSL</processClass>
                    <id>00007735</id>
                    <mainVersion>2</mainVersion>
                    <subVersion>3</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </blFlashTA>
        </blFlash>
    </talLine>
    <talLine id="tl_2" status="Executable" baseVariant="DKOMBI8" diagAddress="60">
        <swDeploy status="Executable">
            <swDeployTA status="Executable">
                <sgbmid>
                    <processClass>SWFL</processClass>
                    <id>00007736</id>
                    <mainVersion>2</mainVersion>
                    <subVersion>3</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </swDeployTA>
            <swDeployTA status="Executable">
                <sgbmid>
                    <processClass>SWFL</processClass>
                    <id>00007bfc</id>
                    <mainVersion>2</mainVersion>
                    <subVersion>3</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </swDeployTA>
        </swDeploy>
    </talLine>
    <talLine id="tl_3" status="Executable" baseVariant="DKOMBI8" diagAddress="60">
        <cdDeploy status="Executable">
            <cdDeployTA status="Executable">
                <sgbmid>
                    <processClass>CAFD</processClass>
                    <id>00006d4e</id>
                    <mainVersion>0</mainVersion>
                    <subVersion>4</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </cdDeployTA>
        </cdDeploy>
    </talLine>
    <executionTime actualEndTime="0" actualStartTime="0" plannedEndTime="0" plannedStartTime="0"/>
    <installedECUList_Ist/>
    <installedECUList_Soll/>
</tal>

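I need the data for each "id" in its own file, together with the header and footer (see the files below). The file above is the sample input; I need split output files like the following: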

1.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tal xmlns="http://hello.com" schemaVersion="5.0.0" refSchemaFile="tal.xsd" status="Executable">
    <ExecProperties supportsParallelMostFlash="false"/>
    <talLine id="tl_1" status="Executable" baseVariant="DKOMBI8" diagAddress="60">
        <blFlash status="Executable">
            <blFlashTA status="Executable">
                <sgbmid>
                    <processClass>BTLD</processClass>
                    <id>00007732</id>
                    <mainVersion>2</mainVersion>
                    <subVersion>3</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </blFlashTA>
            <blFlashTA status="Executable">
                <sgbmid>
                    <processClass>FLSL</processClass>
                    <id>00007735</id>
                    <mainVersion>2</mainVersion>
                    <subVersion>3</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </blFlashTA>
        </blFlash>
    </talLine>
    <executionTime actualEndTime="0" actualStartTime="0" plannedEndTime="0" plannedStartTime="0"/>
    <installedECUList_Ist/>
    <installedECUList_Soll/>
</tal>

2.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tal xmlns="http://hello.com" schemaVersion="5.0.0" refSchemaFile="tal.xsd" status="Executable">
    <ExecProperties supportsParallelMostFlash="false"/>
    <talLine id="tl_3" status="Executable" baseVariant="DKOMBI8" diagAddress="60">
        <cdDeploy status="Executable">
            <cdDeployTA status="Executable">
                <sgbmid>
                    <processClass>CAFD</processClass>
                    <id>00006d4e</id>
                    <mainVersion>0</mainVersion>
                    <subVersion>4</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </cdDeployTA>
        </cdDeploy>
    </talLine>
    <executionTime actualEndTime="0" actualStartTime="0" plannedEndTime="0" plannedStartTime="0"/>
    <installedECUList_Ist/>
    <installedECUList_Soll/>
</tal>

3.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tal xmlns="http://hello.com" schemaVersion="5.0.0" refSchemaFile="tal.xsd" status="Executable">
    <ExecProperties supportsParallelMostFlash="false"/>
    <talLine id="tl_2" status="Executable" baseVariant="DKOMBI8" diagAddress="60">
        <swDeploy status="Executable">
            <swDeployTA status="Executable">
                <sgbmid>
                    <processClass>SWFL</processClass>
                    <id>00007736</id>
                    <mainVersion>2</mainVersion>
                    <subVersion>3</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </swDeployTA>
            <swDeployTA status="Executable">
                <sgbmid>
                    <processClass>SWFL</processClass>
                    <id>00007bfc</id>
                    <mainVersion>2</mainVersion>
                    <subVersion>3</subVersion>
                    <patchVersion>11</patchVersion>
                </sgbmid>
            </swDeployTA>
        </swDeploy>
    </talLine>
    <executionTime actualEndTime="0" actualStartTime="0" plannedEndTime="0" plannedStartTime="0"/>
    <installedECUList_Ist/>
    <installedECUList_Soll/>
</tal>

I tried to remove the data of specific tags by id, but it did not work. Can you suggest a better way to achieve my goal?

import xml.etree.ElementTree as ET
tree = ET.parse('main.xml')
root = tree.getroot()
mydata = root.find(".talLine[@id='tl_1']")
mydata.remove(mydata)

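Thanks in advance.

【Comments】:

  • I don't seem to see any code showing what you have tried, or a question about your code that says what is not working.
  • Hi @ChrisDoyle, I am very new to Python and do not even know how to proceed.
  • Then Stack Overflow is not the site you are looking for. It is not a tutorial site, nor a place to ask a question and hope someone writes the code for you. There are plenty of sites that will teach you how to parse XML with Python.
  • Hi @ChrisDoyle, I have tried something; could you please check it?

Tags: python python-3.x xml xml-parsing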


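【Solution 1】:

We just need to find the indexes of your tags under the root element, clear the ones we do not want, and save the result to a new XML file.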

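import xml.etree.ElementTree as ET

# Parse the source file; with the layout of main.xml, root[0] is ExecProperties and root[1..3] are the talLine elements
mytree = ET.parse('D://talfiles//TAL_High_Hud_Dcs_002_003_011.xml')
myroot = mytree.getroot()
# clear() empties tl_1 and tl_2 but leaves their (now empty) talLine tags in the tree
myroot[1].clear()
myroot[2].clear()
# Save the tree, which now only carries the data of tl_3, to a new file
mytree.write('D://talfiles//1.xml')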

Your output will then be saved to the new file.

1.xml file

<ns0:tal xmlns:ns0="http://hello.com" refSchemaFile="tal.xsd" schemaVersion="5.0.0" status="Executable">
<ns0:ExecProperties supportsParallelMostFlash="false" />
<ns0:talLine /><ns0:talLine /><ns0:talLine baseVariant="DKOMBI8" diagAddress="60" id="tl_3" status="Executable">
    <ns0:cdDeploy status="Executable">
        <ns0:cdDeployTA status="Executable">
            <ns0:sgbmid>
                <ns0:processClass>CAFD</ns0:processClass>
                <ns0:id>00006d4e</ns0:id>
                <ns0:mainVersion>0</ns0:mainVersion>
                <ns0:subVersion>4</ns0:subVersion>
                <ns0:patchVersion>11</ns0:patchVersion>
            </ns0:sgbmid>
        </ns0:cdDeployTA>
    </ns0:cdDeploy>
</ns0:talLine>
<ns0:executionTime actualEndTime="0" actualStartTime="0" plannedEndTime="0" plannedStartTime="0" />
<ns0:installedECUList_Ist />
<ns0:installedECUList_Soll />
</ns0:tal>
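
The hard-coded indexes depend on the order of the child elements, and clear() leaves empty talLine placeholders in the output. As a rough alternative sketch, the unwanted talLine elements can be looked up by their id attribute and removed from their parent instead. Note that the elements live in the default namespace http://hello.com, so a plain "talLine" path does not match them, and remove() has to be called on the parent element, not on the element itself. The helper name keep_only and the shortened SAMPLE document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the xmlns attribute of <tal>; the prefix "t" is arbitrary.
NS = {"t": "http://hello.com"}

# Shortened stand-in for main.xml (assumption: same overall layout).
SAMPLE = """<tal xmlns="http://hello.com">
    <ExecProperties supportsParallelMostFlash="false"/>
    <talLine id="tl_1"/>
    <talLine id="tl_2"/>
    <talLine id="tl_3"/>
    <executionTime actualEndTime="0"/>
</tal>"""


def keep_only(root, keep_id):
    """Remove every talLine child of root except the one whose id is keep_id."""
    # findall() returns a list, so removing while looping over it is safe.
    for line in root.findall("t:talLine", NS):
        if line.get("id") != keep_id:
            root.remove(line)  # remove() is called on the parent element


root = ET.fromstring(SAMPLE)
keep_only(root, "tl_3")
remaining = [line.get("id") for line in root.findall("t:talLine", NS)]
print(remaining)  # ['tl_3']
```

With this approach the header and footer elements (ExecProperties, executionTime and so on) stay untouched, and no empty talLine tags remain.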

There is still one challenge here: we need to avoid the ns0 prefix that appears on every line of the output XML file.
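
One way that may avoid the ns0 prefix is to register the document's namespace URI with an empty prefix via ET.register_namespace before serializing, so ElementTree writes it as the default namespace again. The sketch below combines that with a split into one tree per talLine id. The function name split_by_talline, the shortened SAMPLE document, and the output file naming in the comment are assumptions for illustration, not part of the original answer.

```python
import copy
import xml.etree.ElementTree as ET

NS_URI = "http://hello.com"  # taken from the xmlns attribute of <tal>
# With an empty prefix registered, ElementTree serializes xmlns="..." on the root
# instead of generating an ns0 prefix for every element.
ET.register_namespace("", NS_URI)

# Shortened stand-in for main.xml (assumption: same overall layout).
SAMPLE = """<tal xmlns="http://hello.com" status="Executable">
    <ExecProperties supportsParallelMostFlash="false"/>
    <talLine id="tl_1"/>
    <talLine id="tl_2"/>
    <talLine id="tl_3"/>
    <installedECUList_Soll/>
</tal>"""


def split_by_talline(root):
    """Yield (id, tree) pairs: one copy of the document per talLine id."""
    tag = f"{{{NS_URI}}}talLine"
    ids = [line.get("id") for line in root.findall(tag)]
    for keep_id in ids:
        part = copy.deepcopy(root)  # keep the original tree intact
        for line in part.findall(tag):
            if line.get("id") != keep_id:
                part.remove(line)
        yield keep_id, ET.ElementTree(part)


root = ET.fromstring(SAMPLE)
parts = {}
for keep_id, tree in split_by_talline(root):
    # To write real files one could call, for example:
    # tree.write(f"{keep_id}.xml", encoding="UTF-8", xml_declaration=True)
    parts[keep_id] = ET.tostring(tree.getroot(), encoding="unicode")

print(parts["tl_1"])
```

Each resulting document keeps the shared header and footer elements and exactly one talLine. As far as I know, ElementTree's own XML declaration does not include standalone="yes", so if that attribute matters the declaration would have to be written by hand.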

【Comments】:
