【Question title】: Parse an XML tag's attribute value via ElementTree and replace the value string
【Posted】: 2016-08-12 21:42:11
【Question】:

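I'm new to Python and am trying to automate a task. I've spent two days reading the documentation and looking at various similar questions, but I've hit a wall and can't move forward.

I feel that the Python documentation for the ElementTree module isn't very thorough. Maybe that's just me. I also know that I could use other modules, but please guide me using ElementTree only.

The task is to parse the XML and use ElementTree to replace attribute values in the tags. In web-server-parm, I need to replace every link that contains "http://api-stg.link.com". For example...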

From

ServerAddr="http://api-stg.link.com/dataapi/v2/exchangerates/"

to

ServerAddr="http://api-DATA-stg.link.com/dataapi/v2/exchangerates/"

XML test.xml

<?xml version="1.0" encoding="utf-8"?>
<ConfigRoot>
  <max-layer layer="5"/>
  <enabled-cache status="0"/>
  <server type="fgrfreefr">
    <web-server-parm mode="QA" ServerAddr="http://api-stg.link.com/dataapi/v2/securities?response=complex&amp;limit=9999999" timedOut="10000" X-API-UserId="54456464561" X-API-ProductId="ADS" ApiKey="fgggdfvdffdgdfg"/>
    <web-server-parm mode="UAT" ServerAddr="http://api-uat.link.com/dataapi/v2/securities?response=complex&amp;limit=9999999" timedOut="10000" X-API-UserId="gdfsgvhdgjfjuhgdyejhgsfegtb" X-API-ProductId="ADS" ApiKey="@gggf-fsffff@"/>
  </server>
  <server type="vfffdg">
    <web-server-parm mode="QA" ServerAddr="http://api-stg.link.com/dataapi/v2/exchangerates/" timedOut="10000" X-API-UserId="gsfsftfdfrfefrferf" X-API-ProductId="ADS" ApiKey="fgvdgggdfgttggefr"/>
    <web-server-parm mode="UAT" ServerAddr="http://api-uat.link.com/dataapi/v2/exchangerates/" timedOut="10000" X-API-UserId="gdfdagtgdfsgtrsdfsg" X-API-ProductId="ADS" ApiKey="@hdvfddfdd"/>
  </server>
</ConfigRoot>

Task.py (this is what I have so far)

import xml.etree.ElementTree as ET 
# import XML, SubElement, Element, tostring

#----------------------------------------------------------------------
def parseXML(xml_file):
    """
    Parse XML with ElementTree
    """
    tree = ET.ElementTree(file=xml_file)
    root = tree.getroot()

    # get the information via the children!

    print "Iterating using getchildren()"

    node = root.getchildren()
    for server_addr in node:
        node_children = server_addr.getchildren()
        for node_child in node_children:
            print "_________passed__________"
            print "%s=%s" % (node_child.attrib, node_child.text)
            test = node_child.findtext("http://api-stg.link.com/dataapi/v2/exchangerates/")
            if test is None:
                continue
            tests = test.text
            print tests

# #----------------------------------------------------------------------
if __name__ == "__main__":
    parseXML("test/test.xml")

【Comments】:

  • Are you replacing ServerAddr="http://api-stg.link.com/dataapi/v2/exchangerates/ with ServerAddr="http://api-DATA-stg.link.com/dataapi/v2/exchangerates/? If so, you could consider just doing a string replace: new_xml.replace('ServerAddr="http://api-stg.link.com/dataapi/v2/exchangerates/', 'ServerAddr="http://api-DATA-stg.link.com/dataapi/v2/exchangerates/')
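The plain string replacement suggested in the comment above could look roughly like the sketch below. It treats the XML as text and does not use ElementTree at all, so it only illustrates the comment and is not the approach the question asks for. The sample text and the helper name replace_server_addr are assumptions made for illustration; the sketch also shortens the search string to the host part so that every ServerAddr on that host is matched.

```python
# Hypothetical sample text standing in for the contents of test.xml.
xml_text = (
    '<ConfigRoot>'
    '<web-server-parm mode="QA" ServerAddr="http://api-stg.link.com/dataapi/v2/exchangerates/"/>'
    '<web-server-parm mode="UAT" ServerAddr="http://api-uat.link.com/dataapi/v2/exchangerates/"/>'
    '</ConfigRoot>'
)

OLD = 'ServerAddr="http://api-stg.link.com'
NEW = 'ServerAddr="http://api-DATA-stg.link.com'


def replace_server_addr(text):
    """Swap the staging host inside ServerAddr attributes by text substitution."""
    return text.replace(OLD, NEW)


new_xml = replace_server_addr(xml_text)
print(new_xml)
```

Because this works on raw text, it would also rewrite a matching string that happens to appear in a comment or in element text, which is one reason to prefer an ElementTree-based approach for anything beyond a quick fix.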

Tags: python xml parsing elementtree


【Solution 1】:

Consider iterating over the elements with iter() and replacing the attribute value inside a conditional if:

import xml.etree.ElementTree as ET

#----------------------------------------------------------------------
def parseXML(xml_file):
    """
    Parse XML with ElementTree and replace the staging host in ServerAddr
    """
    tree = ET.ElementTree(file=xml_file)
    root = tree.getroot()

    # visit every web-server-parm element under each server element
    print("Iterating using iter()")

    for serv in root.iter('server'):
        for web in serv.iter('web-server-parm'):
            addr = web.get('ServerAddr')
            # skip elements that have no ServerAddr attribute
            if addr and 'http://api-stg.link.com' in addr:
                web.set('ServerAddr',
                        addr.replace("http://api-stg.link.com", "http://api-DATA-stg.link.com"))

    print(ET.tostring(root).decode("UTF-8"))

    # save the modified tree to a new file
    tree.write("ConfigRoot_py.xml")

# #----------------------------------------------------------------------
if __name__ == "__main__":
    parseXML("ConfigRoot.xml")

Output

<ConfigRoot>
  <max-layer layer="5" />
  <enabled-cache status="0" />
  <server type="fgrfreefr">
    <web-server-parm ApiKey="fgggdfvdffdgdfg" ServerAddr="http://api-DATA-stg.link.com/dataapi/v2/securities?response=complex&amp;limit=9999999" X-API-ProductId="ADS" X-API-UserId="54456464561" mode="QA" timedOut="10000" />
    <web-server-parm ApiKey="@gggf-fsffff@" ServerAddr="http://api-uat.link.com/dataapi/v2/securities?response=complex&amp;limit=9999999" X-API-ProductId="ADS" X-API-UserId="gdfsgvhdgjfjuhgdyejhgsfegtb" mode="UAT" timedOut="10000" />
  </server>
  <server type="vfffdg">
    <web-server-parm ApiKey="fgvdgggdfgttggefr" ServerAddr="http://api-DATA-stg.link.com/dataapi/v2/exchangerates/" X-API-ProductId="ADS" X-API-UserId="gsfsftfdfrfefrferf" mode="QA" timedOut="10000" />
    <web-server-parm ApiKey="@hdvfddfdd" ServerAddr="http://api-uat.link.com/dataapi/v2/exchangerates/" X-API-ProductId="ADS" X-API-UserId="gdfdagtgdfsgtrsdfsg" mode="UAT" timedOut="10000" />
  </server>
</ConfigRoot>
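Note that the output above no longer starts with the <?xml ...?> declaration that the input file had. If the declaration matters, ElementTree's write() accepts encoding and xml_declaration arguments. The following is a minimal sketch, assuming a shortened, hypothetical stand-in for ConfigRoot.xml and writing to an in-memory buffer rather than to disk:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for ConfigRoot.xml.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<ConfigRoot>
  <server type="vfffdg">
    <web-server-parm mode="QA" ServerAddr="http://api-stg.link.com/dataapi/v2/exchangerates/"/>
  </server>
</ConfigRoot>
"""

tree = ET.parse(io.BytesIO(SAMPLE))

for web in tree.getroot().iter("web-server-parm"):
    addr = web.get("ServerAddr", "")
    if "http://api-stg.link.com" in addr:
        web.set("ServerAddr",
                addr.replace("http://api-stg.link.com", "http://api-DATA-stg.link.com"))

# Write to an in-memory buffer; pass a file name instead to write to disk.
buf = io.BytesIO()
tree.write(buf, encoding="utf-8", xml_declaration=True)
output = buf.getvalue()
print(output.decode("utf-8"))
```

With xml_declaration=True the serialized bytes begin with an XML declaration again, so the written file stays closer to the original input.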

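【Comments】:

  • Thanks @Parfait! However, I have a huge XML file that contains all sorts of links. I want to edit not only the http links with "exchangerates" but also the ones with "securities".
  • In your current output, only the second server block is changed.
  • But in your post you specified replacing only this: http://api-stg.link.com/dataapi/v2/exchangerates/, which appears only once in the XML you posted.
  • Yes, that was my fault for not being clear. Do you know how I can solve this?
  • See the updated script, which uses str.replace() to replace the "http://api-stg.link.com" stem with "http://api-DATA-stg.link.com".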
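
To illustrate the last comment, here is a small sketch of the stem replacement run on an in-memory sample, so the effect on both the "securities" and the "exchangerates" links can be checked without any files. The sample is a shortened, hypothetical version of the question's test.xml, and the helper name replace_stem is an assumption made for illustration, not part of the original answer.

```python
import xml.etree.ElementTree as ET

OLD_STEM = "http://api-stg.link.com"
NEW_STEM = "http://api-DATA-stg.link.com"

# Hypothetical, shortened sample modelled on the question's test.xml.
SAMPLE = """<ConfigRoot>
  <server type="fgrfreefr">
    <web-server-parm mode="QA" ServerAddr="http://api-stg.link.com/dataapi/v2/securities?response=complex&amp;limit=9999999"/>
    <web-server-parm mode="UAT" ServerAddr="http://api-uat.link.com/dataapi/v2/securities?response=complex&amp;limit=9999999"/>
  </server>
  <server type="vfffdg">
    <web-server-parm mode="QA" ServerAddr="http://api-stg.link.com/dataapi/v2/exchangerates/"/>
    <web-server-parm mode="UAT" ServerAddr="http://api-uat.link.com/dataapi/v2/exchangerates/"/>
  </server>
</ConfigRoot>"""


def replace_stem(root, old=OLD_STEM, new=NEW_STEM):
    """Rewrite ServerAddr on every web-server-parm element; return how many changed."""
    changed = 0
    for web in root.iter("web-server-parm"):
        addr = web.get("ServerAddr")
        if addr and old in addr:
            web.set("ServerAddr", addr.replace(old, new))
            changed += 1
    return changed


root = ET.fromstring(SAMPLE)
changed = replace_stem(root)
addresses = [web.get("ServerAddr") for web in root.iter("web-server-parm")]

print(changed)
for address in addresses:
    print(address)
```

Only the two QA entries contain the staging stem, so those are the ones rewritten; the UAT entries on api-uat.link.com are left untouched.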