【Question Title】: Code being dropped from xml created using python
【Posted】: 2012-02-23 20:54:11
【Question】:

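I am copying a metadata xml file and then updating it with python. This works fine, except that the following code from the original metadata file gets dropped: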

<?xml version="1.0" encoding="utf-8"?><?xml-stylesheet type='text/xsl' href='ANZMeta.xsl'?>

It needs to go at the beginning of the file.

There is an answer for PHP at "xml insertion at specific point of xml file", but I need a Python solution.

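The code and a full explanation are in my original post, "Search and replace multiple lines in xml/text files using python", but I am asking this separately because it is a different problem from my original question.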

Thanks,

Full code

import os, xml, arcpy, shutil, datetime, Tkinter, tkFileDialog, tkSimpleDialog
from xml.etree import ElementTree as et 

path=os.getcwd()
RootDirectory=path
currentPath=path
arcpy.env.workspace = path
Count=0
DECLARATION = """<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type='text/xsl' href='ANZMeta.xsl'?>\n"""
Generated_XMLs=RootDirectory+'\GeneratedXML_LOG.txt'
f = open(Generated_XMLs, 'a')
f.write("Log of Metadata Creation Process - Update: "+str(datetime.datetime.now())+"\n")
f.close()

for root, dirs, files in os.walk(RootDirectory, topdown=False):
    #print root, dirs
    for directory in dirs:
        try:
            currentPath=os.path.join(root,directory)
        except:
            pass
        os.chdir(currentPath)
        arcpy.env.workspace = currentPath
        print currentPath
#def Create_xml(currentPath):

        FileList = arcpy.ListFeatureClasses()
        zone="_Zone"

        for File in FileList:
            Count+=1
            FileDesc_obj = arcpy.Describe(File)
            FileNm=FileDesc_obj.file
            check_meta=os.listdir(currentPath)
            existingXML=FileNm[:FileNm.find('.')]
            existingExtension=FileNm[FileNm.find('.'):]
            print "XML: "+existingXML
            #print check_meta
            #if  existingXML+'.xml' in check_meta:
            #newMetaFile='new'
            for f in check_meta:
                if f.startswith(existingXML) and f.endswith('.xml'):
                    print "exists, file name:", f
                    newMetaFile=FileNm+"_2012Metadata.xml"
                    try:
                        shutil.copy2(f, newMetaFile)
                    except:
                        pass
                    break
                else:
                    #print "Does not exist"
                    newMetaFile=FileNm+"_BaseMetadata.xml"

            print "New meta file: "+newMetaFile+ " for: "+File
            if newMetaFile.endswith('_BaseMetadata.xml'):        
                print "calling tkinter"
                root = Tkinter.Tk()
                root.withdraw()
                file = tkFileDialog.askopenfile(parent=root,mode='rb',title='Choose a xml base file to match with: '+File)
                if file != None:
                    metafile=os.path.abspath(file.name)
                    file.close()
                    #print metafile
                    shutil.copy2(metafile,newMetaFile)
                    print "copied"+metafile
                    root.destroy

                else:
                    shutil.copy2('L:\Data_Admin\QA\Metadata_python_toolset\Master_Metadata.xml', newMetaFile)
                    #root = Tkinter.Tk()
                    #root.withdraw()
                    #newTitle=tkSimpleDialog.askstring('title', 'prompt')
                    #root.destroy
                    #print newTitle

            print "Parsing meta file: "+newMetaFile
            tree=et.parse(newMetaFile)        
            print "Processing: "+str(File)

            for node in tree.findall('.//title'):
                node.text = str(FileNm)
            for node in tree.findall('.//procstep/srcused'):
                node.text = str(currentPath+"\\"+existingXML+".xml")
            dt=dt=str(datetime.datetime.now())
            for node in tree.findall('.//procstep/date'):
                node.text = str(dt[:10])
            for node in tree.findall('.//procstep/time'):
                node.text = str(dt[11:13]+dt[16:19])
            for node in tree.findall('.//metd/date'):
                node.text = str(dt[:10])
            for node in tree.findall('.//northbc'):
                node.text = str(FileDesc_obj.extent.YMax)
            for node in tree.findall('.//southbc'):
                node.text = str(FileDesc_obj.extent.YMin)
            for node in tree.findall('.//westbc'):
                node.text = str(FileDesc_obj.extent.XMin)
            for node in tree.findall('.//eastbc'):
                node.text = str(FileDesc_obj.extent.XMax)        
            for node in tree.findall('.//native/nondig/formname'):
                node.text = str(os.getcwd()+"\\"+File)
            for node in tree.findall('.//native/digform/formname'):
                node.text = str(FileDesc_obj.featureType)
            for node in tree.findall('.//avlform/nondig/formname'):
                node.text = str(FileDesc_obj.extension)
            for node in tree.findall('.//avlform/digform/formname'):
                node.text = str(float(os.path.getsize(File))/int(1024))+" KB"
            for node in tree.findall('.//theme'):
                node.text = str(FileDesc_obj.spatialReference.name +" ; EPSG: "+str(FileDesc_obj.spatialReference.factoryCode))
            print node.text
            projection_info=[]
            Zone=FileDesc_obj.spatialReference.name

            if "GCS" in str(FileDesc_obj.spatialReference.name):
                projection_info=[FileDesc_obj.spatialReference.GCSName, FileDesc_obj.spatialReference.angularUnitName, FileDesc_obj.spatialReference.datumName, FileDesc_obj.spatialReference.spheroidName]
                print "Geographic Coordinate system"
            else:
                projection_info=[FileDesc_obj.spatialReference.datumName, FileDesc_obj.spatialReference.spheroidName, FileDesc_obj.spatialReference.angularUnitName, Zone[Zone.rfind(zone)-3:]]
                print "Projected Coordinate system"
            x=0
            for node in tree.findall('.//spdom'):
                for node2 in node.findall('.//keyword'):
                    #print node2.text
                    node2.text = str(projection_info[x])
                    #print node2.text
                    x=x+1


            tree.write(newMetaFile)
            with open(newMetaFile, 'w') as output: # would be better to write to temp file and rename
                output.write(DECLARATION)
                tree.write(output, xml_declaration=False, encoding='utf-8') 
    # xml_declaration=False - don't write default declaration   

            f = open(Generated_XMLs, 'a')
            f.write(str(Count)+": "+File+"; "+newMetaFile+"; "+currentPath+";"+existingXML+"\n")
            f.close()



    #        Create_xml(currentPath)

Error message from Wing IDE

xml.parsers.expat.ExpatError: no element found: line 3, column 0
File "L:\Data_Admin\QA\Metadata_python_toolset\test2\update_Metadata1f.py", line 78, in tree=et.parse(newMetaFile)
File "C:\Python26\ArcGIS10.0\Lib\xml\etree\ElementTree.py", line 862, in parse tree.parse(source, parser)
File "C:\Python26\ArcGIS10.0\Lib\xml\etree\ElementTree.py", line 587, in parse self._root = parser.close()
File "C:\Python26\ArcGIS10.0\Lib\xml\etree\ElementTree.py", line 1254, in close self._parser.Parse("", 1) # end of data

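【Comments】:

  • Show us your current code. As it stands, you haven't even told us which parser you are using.
  • I have added it. I did not want to duplicate the material that is already in the other question. Best.
  • There seems to be an error in your XML file. One more thing: remove tree.write(newMetaFile) before with open...
  • I can't track down the xml error. Is there any other way to get this declaration in?

Tags: python xml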


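【Solution 1】:

I also had a hard time adding a PI (processing instruction) to the start of an ElementTree document. I came up with a solution that uses a fake root node (with None as the element tag) to hold any required processing instructions, followed by the real document root node.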

import xml.etree.ElementTree as ET

# Build your XML document as normal...
root = ET.Element('root')

# Create 'fake' root node
fake_root = ET.Element(None)

# Add desired processing instructions.  Repeat as necessary.
pi = ET.PI("xml-stylesheet", "type='text/xsl' href='ANZMeta.xsl'")
pi.tail = "\n"
fake_root.append(pi)

# Add real root as last child of fake root
fake_root.append(root)

# Write to file, using ElementTree.write( ) to generate <?xml ...?> tag.
tree = ET.ElementTree(fake_root)
tree.write("doc.xml", xml_declaration=True)

The resulting doc.xml file:

<?xml version='1.0' encoding='us-ascii'?>
<?xml-stylesheet type='text/xsl' href='ANZMeta.xsl'?>
<root />
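
The same fake-root idea can be applied to a document that was parsed from an existing file rather than built from scratch, which is closer to the situation in the question. The following is a minimal sketch for Python 3, not the answerer's code: the helper name with_stylesheet and the sample <metadata> document are made up for illustration, and in-memory buffers stand in for the real metadata files.

```python
import io
import xml.etree.ElementTree as ET


def with_stylesheet(root, href):
    """Wrap root in a tag-less fake root that also holds an xml-stylesheet PI.

    Hypothetical helper: the name and signature are illustrative only.
    """
    fake_root = ET.Element(None)  # a None tag is serialized without any markup
    pi = ET.PI("xml-stylesheet", "type='text/xsl' href='%s'" % href)
    pi.tail = "\n"  # put the real root on its own line
    fake_root.append(pi)
    fake_root.append(root)
    return ET.ElementTree(fake_root)


# Stand-in for an existing metadata file (assumed content).
source = io.BytesIO(b"<metadata><title>old</title></metadata>")
root = ET.parse(source).getroot()

# Update the parsed document as usual.
for node in root.findall(".//title"):
    node.text = "new"

# Serialize with the declaration generated by ElementTree and the PI in front.
buffer = io.BytesIO()
with_stylesheet(root, "ANZMeta.xsl").write(
    buffer, encoding="utf-8", xml_declaration=True
)
result = buffer.getvalue().decode("utf-8")
print(result)
```

To write to disk instead, pass a file name in place of buffer. The default parser discards processing instructions while reading, so the PI has to be added back on every write.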

【Comments】:

    【Solution 2】:

    If all of your xml files have the same declaration, you can write it yourself:

    import xml.etree.ElementTree as ET
    
    
    DECLARATION = """<?xml version="1.0" encoding="utf-8"?>
    <?xml-stylesheet type='text/xsl' href='ANZMeta.xsl'?>\n"""
    
    tree = ET.parse(filename)
    # do some work on tree
    
    with open(filename, 'w') as output: # would be better to write to temp file and rename
        output.write(DECLARATION)
        tree.write(output, xml_declaration=False, encoding='utf-8') 
        # xml_declaration=False - don't write default declaration
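
The comment in the snippet above suggests writing to a temporary file and renaming it. A minimal sketch of that idea for Python 3 follows; it is an illustration under assumptions, not the answerer's code. The function name write_with_declaration is hypothetical, the header is stored as bytes, and the file is opened in binary mode because ElementTree emits encoded bytes when an encoding such as utf-8 is requested.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# The same two header lines as in the answer, as bytes for a binary file.
DECLARATION = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b"<?xml-stylesheet type='text/xsl' href='ANZMeta.xsl'?>\n"
)


def write_with_declaration(tree, filename):
    """Write DECLARATION plus the tree; replace filename only on success.

    Hypothetical helper: the name and signature are illustrative only.
    """
    directory = os.path.dirname(os.path.abspath(filename))
    # Create the temporary file next to the target so the rename stays
    # on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(suffix=".xml", dir=directory)
    try:
        with os.fdopen(fd, "wb") as output:
            output.write(DECLARATION)
            # Suppress ElementTree's own declaration; ours is already written.
            tree.write(output, encoding="utf-8", xml_declaration=False)
        os.replace(tmp_name, filename)
    except Exception:
        # Leave the original file untouched and clean up the partial output.
        os.remove(tmp_name)
        raise
```

With this shape, a failure while serializing leaves the original metadata file as it was instead of a file that contains only the header lines. A call would look like write_with_declaration(tree, newMetaFile) after the tree has been updated.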
    

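    【Comments】:

    • Thanks. It seems to overwrite the whole file with just the declaration. I have added the code to the question (your example is at line 130), and it gives the following error: xml.parsers.expat.ExpatError: no element found: line 3
    • I still get File "C:\Python26\ArcGIS10.0\Lib\xml\etree\ElementTree.py", line 1254, in close self._parser.Parse("", 1) # end of data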
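
The "no element found" error is what the parser reports when a file contains the header lines but no root element. One plausible explanation, which the thread does not confirm: the traceback shows Python 2.6, and to my knowledge the xml_declaration keyword of ElementTree.write() only arrived with ElementTree 1.3 in Python 2.7, so the write call may have failed after the header had already been written, leaving a header-only file for the next run to parse. The Python 3 sketch below only demonstrates the parsing side; the helper name parse_error_message and the <metadata/> element are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The two header lines from the answer, with nothing after them.
DECLARATION = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b"<?xml-stylesheet type='text/xsl' href='ANZMeta.xsl'?>\n"
)


def parse_error_message(data):
    """Return the parser's error message for data, or None if it parses.

    Hypothetical helper: the name is illustrative only.
    """
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        return str(exc)
    return None


# A file holding only the header has no root element, so parsing fails.
header_only = parse_error_message(DECLARATION)
# The same header followed by a root element parses without error.
complete = parse_error_message(DECLARATION + b"<metadata/>")

print(header_only)
print(complete)
```

If this is what happened, restoring the damaged metadata files from the originals before rerunning the script would be a necessary first step, since a header-only file cannot be parsed.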