【Posted】:2021-08-06 02:08:56
【Question】:
My XML file contains 10k users, and I need to remove every user whose email does not contain @acme.com.
<?xml version="1.0" encoding="UTF-8"?>
<users type="array">
<user>
<id type="integer">14000760626</id>
<name> Credentialing Department</name>
<email>user1@acme.com</email>
<created-at type="dateTime">2020-03-26T10:23:34-04:00</created-at>
<updated-at type="dateTime">2020-03-26T10:23:34-04:00</updated-at>
<active type="boolean">false</active>
<job-title></job-title>
<phone>1234567890</phone>
<mobile>1234567890</mobile>
<description></description>
<time-zone>Eastern Time (US &amp; Canada)</time-zone>
<deleted type="boolean">false</deleted>
<language>en</language>
<address></address>
<external-id nil="true"/>
<helpdesk-agent type="boolean">false</helpdesk-agent>
<location-name nil="true"/>
<time-format>12h</time-format>
<company-names type="array"/>
<custom_field>
</custom_field>
</user>
</users>
I tried following "how do I filter values from XML file in python", but got stuck adapting the following line to my data:
>>> xmldata.xpath('/localization/b[@n="Levels"]/l[@k=$level]/v/text()',level='Level1')
['Beginner Level']
I also tried other approaches, but some data (attributes and empty elements) always gets lost. Example result:
<?xml version="1.0" encoding="UTF-8"?>
<users type="array">
<user>
<id>14000760626</id>
<name> Credentialing Department</name>
<email>test@aoncology.com</email>
<created-at>2020-03-26T10:23:34-04:00</created-at>
<updated-at>2020-03-26T10:23:34-04:00</updated-at>
<active>false</active>
<job-title>None</job-title>
<phone>1234567890</phone>
<mobile>1234567890</mobile>
<description>None</description>
<time-zone>Eastern Time (US &amp; Canada)</time-zone>
<deleted>false</deleted>
<language>en</language>
<address>None</address>
<external-id>None</external-id>
<helpdesk-agent>false</helpdesk-agent>
<location-name>None</location-name>
<time-format>12h</time-format>
<company-names>None</company-names>
<custom_field>
</custom_field>
</user>
</users>
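For reference, the filtering described above can be sketched with the standard library's `xml.etree.ElementTree`. This is a minimal sketch, not a definitive solution. It assumes every `<user>` is a direct child of `<users>` and has at most one `<email>` child. The function name `keep_domain_users` and the shortened sample document are made up for illustration. Because the tree is edited in place rather than converted to another structure and rebuilt, attributes such as `type="integer"` and `nil="true"` should be preserved.

```python
import xml.etree.ElementTree as ET


def keep_domain_users(xml_text, domain="@acme.com"):
    """Return the root element with users outside `domain` removed.

    Hypothetical helper: assumes <user> elements are direct children
    of the root and carry at most one <email> child.
    """
    root = ET.fromstring(xml_text)
    # Iterate over a copy of the list, since we remove children while looping.
    for user in list(root.findall("user")):
        # findtext returns the default when <email> is missing.
        email = user.findtext("email", default="")
        if domain not in email:
            root.remove(user)
    return root


# Shortened, made-up sample in the same shape as the data in the question.
sample = """<users type="array">
  <user>
    <id type="integer">1</id>
    <email>user1@acme.com</email>
    <time-zone>Eastern Time (US &amp; Canada)</time-zone>
    <external-id nil="true"/>
  </user>
  <user>
    <id type="integer">2</id>
    <email>test@aoncology.com</email>
    <external-id nil="true"/>
  </user>
</users>"""

filtered = keep_domain_users(sample)
result = ET.tostring(filtered, encoding="unicode")
print(result)
```

For a real file, the same loop could be applied to `ET.parse(path).getroot()`, and the tree written back with `tree.write(out_path, encoding="UTF-8", xml_declaration=True)`. Here `path` and `out_path` are placeholders. Note that ElementTree serializes empty elements in the short form (for example `<job-title />`), which is equivalent XML but not byte-identical to the input.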
【Discussion】:
Tags: python xml-parsing