【Question title】: Python minidom can't parse xml
【Posted】: 2015-02-19 11:23:49
【Question】:

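I have an XML file, shown below, and I am trying to parse it with Python minidom, but I am running into problems. I want to extract attribute values such as the string attribute of ManagedElementId and <associatedSite string = "Site=site00972"/>, but have had no luck. I could not get it working from the Python minidom tutorials I found online, so I need help with how to do it. Here is my attempt: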

#!/usr/bin/python 
import os
import xml.dom.minidom
from xml.dom import minidom
from xml.dom.minidom import parseString,parse
from xml.dom.minidom import Node


xmldoc = minidom.parse("proba.xml")

model= xmldoc.getElementsByTagName('ManagedElementId string = ')
for node in model:
    print node.firstChild.nodeValue

I want to get the value of each string attribute (the text between the quotes).

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE Model SYSTEM "/opt/ericsson/arne/etc/arne12_2.dtd">
<Model version = "1" importVersion = "12.2">
<!--Validate: /opt/ericsson/arne/bin/import.sh -f 4_siu_create.xml \ -val:rall -->
    <Create>
        <SubNetwork userLabel = "ZLNOUR_SIU" networkType = "IPRAN">
            <ManagedElement sourceType = "SIU">
                <ManagedElementId string = "siu009722"/>
                <primaryType type = "STN"/>
                <managedElementType types = ""/>
                <associatedSite string = "Site=site00972"/>
                <nodeVersion string = "T11A"/>
                <platformVersion string = ""/>
                <swVersion string = ""/>
                <vendorName string = ""/>
                <userDefinedState string = ""/>
                <managedServiceAvailability int = "1"/>
                <isManaged boolean = "true"/>
                <connectionStatus string = "OFF"/>
                <Connectivity>
                    <DEFAULT>
                        <emUrl url = "http://10.131.203.117:80/"/>
                        <ipAddress string = "10.131.203.117"/>
                        <oldIpAddress string = "int dummy=0"/>
                        <hostname string = ""/>
                        <nodeSecurityState state = "ON"/>
                        <boardId string = ""/>
                        <Protocol number = "0">
                            <protocolType string = "SNMP"/>
                            <port int = "161"/>
                            <protocolVersion string = "v2c"/>
                            <securityName string = ""/>
                            <authenticationMethod string = ""/>
                            <encryptionMethod string = ""/>
                            <communityString string = "public"/>
                            <context string = ""/>
                            <namingUrl string = ""/>
                            <namingPort int = ""/>
                            <notificationIRPAgentVersion string = ""/>
                            <alarmIRPAgentVersion string = ""/>
                            <notificationIRPNamingContext context = ""/>
                            <alarmIRPNamingContext context = ""/>
                        </Protocol>
                        <Protocol number = "1">
                            <protocolType string = "SSH"/>
                            <port int = "22"/>
                            <protocolVersion string = ""/>
                            <securityName string = ""/>
                            <authenticationMethod string = ""/>
                            <encryptionMethod string = ""/>
                            <communityString string = ""/>
                            <context string = ""/>
                            <namingUrl string = ""/>
                            <namingPort int = ""/>
                            <notificationIRPAgentVersion string = ""/>
                            <alarmIRPAgentVersion string = ""/>
                            <notificationIRPNamingContext context = ""/>
                            <alarmIRPNamingContext context = ""/>
                        </Protocol>
                        <Browser>
                            <browser string = ""/>
                            <browserURL string = ""/>
                            <bookname string = ""/>
                        </Browser>
                    </DEFAULT>
                </Connectivity>
                <Tss>
                    <Entry>
                        <System string = "siu009722"/>
                        <Type string = "NORMAL"/>
                        <User string = "admin"/>
                        <Password string = "siu009722"/>
                    </Entry>
                    <Entry>
                        <System string = "siu009722"/>
                        <Type string = "SECURE"/>
                        <User string = "admin"/>
                        <Password string = "siu009722"/>
                    </Entry>
                </Tss>
                <Relationship>
                    <AssociableNode TO_FDN = "FtpServer=SMRSSLAVE-rtwaned1o,FtpService=swstore-rtwaned1o" AssociationType = "ManagedElement_to_ftpSwStore"/>
                    <AssociableNode TO_FDN = "FtpServer=SMRSSLAVE-rtwaned1o,FtpService=cmdown-rtwaned1o" AssociationType = "ManagedElement_to_neTransientCmDown"/>
                    <AssociableNode TO_FDN = "FtpServer=SMRSSLAVE-rtwaned1o,FtpService=cmup-rtwaned1o" AssociationType = "ManagedElement_to_neTransientCmUp"/>
                    <AssociableNode TO_FDN = "FtpServer=SMRSSLAVE-rtwaned1o,FtpService=pmup-rtwaned1o" AssociationType = "ManagedElement_to_neTransientPm"/>
                    <AssociableNode TO_FDN = "ManagementNode=ONRM" AssociationType = "MgmtAssociation"/>
                    <AssociableNode TO_FDN = "SubNetwork=ZLNOUR3,MeContext=rbs009721,ManagedElement=1,NodeBFunction=1" FROM_FDN = "SubNetwork=ZLNOUR_SIU,ManagedElement=siu009722,StnFunction=STN_ManagedFunction" AssociationType = "StnFunction_to_NodeBFunction"/>
                </Relationship>
            </ManagedElement>
        </SubNetwork>
    </Create>
</Model>

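【Comments】:

  • What makes you think getElementsByTagName('ManagedElementId string = ') would work? You can only search by tag name, and string = is not part of the tag name.
  • I recommend against using minidom. The DOM API is verbose, clunky, and hard to use. Use the ElementTree API instead.

Tags: python xml minidom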


【Solution 1】:

You included an attribute name in the tag name:

model= xmldoc.getElementsByTagName('ManagedElementId string = ')

string = is not part of the tag name; there is no such tag in your document. Drop the string = part:

>>> from xml.dom import minidom
>>> tree = minidom.parseString(sample)
>>> tree.getElementsByTagName('ManagedElementId')
[<DOM Element: ManagedElementId at 0x1080baef0>]
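To see why the original attempt printed nothing, the two calls can be compared side by side. This is a minimal sketch on a one-element stand-in document (an assumption for illustration, not the full proba.xml): the lookup that includes the attribute text matches no elements, so a for loop over its result never runs.

```python
from xml.dom import minidom

# One-element stand-in for the real file (assumption for illustration).
doc = minidom.parseString('<Model><ManagedElementId string = "siu009722"/></Model>')

# The attribute text is not part of the tag name, so nothing matches.
wrong = doc.getElementsByTagName('ManagedElementId string = ')
# Searching by the bare tag name finds the element.
right = doc.getElementsByTagName('ManagedElementId')

print(len(wrong))
print(len(right))
```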

The element has no child nodes; it only has an attribute value:

>>> node = tree.getElementsByTagName('ManagedElementId')[0]
>>> node.firstChild is None
True
>>> node.getAttribute('string')
u'siu009722'
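As a script, the same idea might look like the sketch below. It assumes a trimmed-down copy of the document held in a string called SAMPLE, and the helper name attribute_values is hypothetical; for the real file you would call minidom.parse("proba.xml") instead of parseString.

```python
from xml.dom import minidom

# Trimmed-down stand-in for proba.xml (assumption for illustration).
SAMPLE = """<Model version = "1" importVersion = "12.2">
    <Create>
        <SubNetwork userLabel = "ZLNOUR_SIU" networkType = "IPRAN">
            <ManagedElement sourceType = "SIU">
                <ManagedElementId string = "siu009722"/>
                <associatedSite string = "Site=site00972"/>
            </ManagedElement>
        </SubNetwork>
    </Create>
</Model>
"""


def attribute_values(xml_text, tag_name, attr_name="string"):
    """Return the attr_name attribute of every element named tag_name."""
    doc = minidom.parseString(xml_text)
    return [node.getAttribute(attr_name)
            for node in doc.getElementsByTagName(tag_name)]


print(attribute_values(SAMPLE, "ManagedElementId"))
print(attribute_values(SAMPLE, "associatedSite"))
```

Note that getAttribute returns an empty string when the attribute is missing, so an empty result does not necessarily mean the element was absent.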

However, I strongly recommend staying away from the XML DOM; you are better off using the simpler ElementTree API here:

>>> from xml.etree import ElementTree as ET
>>> tree = ET.fromstring(sample)
>>> tree.find('.//ManagedElementId')
<Element 'ManagedElementId' at 0x1080af950>
>>> tree.find('.//ManagedElementId').get('string')
'siu009722'
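A script version with ElementTree could be sketched as follows. It again assumes a trimmed-down SAMPLE string rather than the real file, and the function name managed_elements and the dictionary keys are hypothetical. It walks every ManagedElement and reads the string attribute of the two child elements the question asks about, printing the result explicitly because a script does not echo values the way the interactive interpreter does.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for proba.xml (assumption for illustration).
SAMPLE = """<Model version = "1" importVersion = "12.2">
    <Create>
        <SubNetwork userLabel = "ZLNOUR_SIU" networkType = "IPRAN">
            <ManagedElement sourceType = "SIU">
                <ManagedElementId string = "siu009722"/>
                <associatedSite string = "Site=site00972"/>
            </ManagedElement>
        </SubNetwork>
    </Create>
</Model>
"""


def managed_elements(xml_text):
    """Collect the id and site of each ManagedElement as a dict."""
    root = ET.fromstring(xml_text)
    result = []
    for element in root.iter("ManagedElement"):
        element_id = element.find("ManagedElementId")
        site = element.find("associatedSite")
        result.append({
            # find() returns None when the child is absent, so guard for it.
            "id": element_id.get("string") if element_id is not None else None,
            "site": site.get("string") if site is not None else None,
        })
    return result


for entry in managed_elements(SAMPLE):
    print(entry["id"], entry["site"])
```

For the file on disk, ET.parse("proba.xml").getroot() would take the place of ET.fromstring.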

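【Comments】:

  • OK, I'll try it with ET, but I can't get a value from the example below. Do I need to print something?
  • @user3319356: I used an interactive interpreter, which echoes results. In a script you would use print, yes.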