【Question title】: XML processing not working
【Posted】: 2016-10-14 10:52:37
【Question】:

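I am trying to extract data from a sensor (it communicates with "XML-like" strings) and convert it to CSV. With my current code I can already write the XML file, but each record ends up on a single line (from the opening root tag to its closing tag).

I don't know whether this is the cause, but I get a "junk after document element" error from ElementTree.parse. Everything I have read so far says the problem lies in the XML structure (multiple roots, no root, etc.), so I am a bit at a loss with my case.

The logged XML file: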

<li820><data><celltemp>5.1120729e1</celltemp><cellpres>9.7705745e1</cellpres><co2>7.7808494e2</co2><co2abs>5.0983281e-2</co2abs><ivolt>1.1380004e1</ivolt><raw>2726238,1977386</raw></data></li820>
<li820><data><celltemp>5.1120729e1</celltemp><cellpres>9.7684698e1</cellpres><co2>7.7823929e2</co2><co2abs>5.0991268e-2</co2abs><ivolt>1.1380004e1</ivolt><raw>2725850,1976922</raw></data></li820>
<li820><data><celltemp>5.1120729e1</celltemp><cellpres>9.7705745e1</cellpres><co2>7.7797288e2</co2><co2abs>5.0977463e-2</co2abs><ivolt>1.1373291e1</ivolt><raw>2726166,1977001</raw></data></li820>

The content of a line like the ones above (in tree view):

<li820>
    <data>
        <celltemp>1.9523970e1</celltemp>
        <cellpres>9.8993663e1</cellpres>
        <co2>3.5942180e4</co2>
        <co2abs>4.0364418e-1</co2abs>
        <ivolt>1.1802978e1</ivolt>
        <raw>2789123,1884335</raw>
    </data>
</li820>

Error:

Traceback (most recent call last):
 File "licor_read.py", line 96, in <module>
  tree = et.parse(file_xml)              # Set XML Parser
 File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
  tree.parse(source, parser)
 File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
  parser.feed(data)
 File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
  self._raiseerror(v)
 File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
  raise err
xml.etree.ElementTree.ParseError: junk after document element: line 2, column 0

My code:

import os, sys, subprocess
import time, datetime
import serial
import string
import glob
import csv
import xml.etree.ElementTree  as et

from xml.etree.ElementTree import XMLParser, XML, fromstring, tostring
from os  import path
from bs4 import BeautifulSoup as bs


#-------------------------------------------------------------
#------------------ Open configurations ----------------------
#-------------------------------------------------------------

  ############
  # Settings #
  ############

DEBUG   = True
LOG     = True
FREQ    = 1
PORT    = '/dev/ttyUSB0'
BAUD    = 9600
PARITY  = 'N'
STOPBIT = 1
BYTE_SZ = 8
TIMEOUT = 5.0

log_dir   = 'logs/'
out_dir   = 'export/'
fname_xml = 'licor820-data-{}.xml'.format(datetime.datetime.now())            # DO NOT touch the {} brackets
fname_csv = 'licor820-data-{}.csv'.format(datetime.datetime.now())            #
isLooping = 20                                                                # Nr of data extractions
isHeader  = True                                                              # Do not touch if data headers are required
isBegin   = False


#-------------------------------------------------------------
#----- Better know what you are doing from this point --------
#-------------------------------------------------------------

  ##################
  # Initialisation #
  ##################

file_xml = os.path.join(log_dir, fname_xml)                                   # Define path and file name
file_csv = os.path.join(out_dir, fname_csv)                                   #
fp_xml   = open(file_xml, 'w')                                                # Open writing streams
fp_csv   = open(file_csv, 'w')                                                #

try:
  buff = serial.Serial(PORT, BAUD, BYTE_SZ, PARITY, STOPBIT, TIMEOUT)         # Open Serial connection
except Exception as e:
  if DEBUG:
    print ("ERROR: {}".format(e))

  sys.exit("Could not connect to the Licor")

csv_writer    = csv.writer(fp_csv)                                            # Define CSV writer
instruct_head = []                                                            # '' 


  ################
  # Main program #
  ################

while isLooping :                                                             # Define nr of refreshed data extracted
  #os.system('clear')
  print('RAW/XML in progress... ' + str(isLooping))                           # Debug this loop

  if(isBegin is False) :                                                      # Verify presence of the <licor> tag
    while(buff.readline()[0] is not '<' and buff.readline()[1] is not 'l') :    
      raw_output = buff.readline()                                            # Jump the lines readed until <licor>
  isBegin = True

  raw_output = buff.readline()
  xml_output = raw_output

  print(xml_output)
  fp_xml.write(xml_output)                                                    # Write from serial port to xml 

  isLooping -= 1

fp_xml.close()

tree = et.parse(file_xml)                                                     # Set XML Parser
root = tree.getroot()                                                         # ''

for instruct_row in root.findall('li820'):                                    # XML to CSV buffer
  instruct = []

  if isHeader is True:                                                        # Buffering header
    celltemp = instruct_row.find('celltemp').tag
    instruct_head.append(celltemp)
    cellpres = instruct_row.find('cellpres').tag
    instruct_head.append(cellpres)      
    co2 = instruct_row.find('co2').tag
    instruct_head.append(co2)      
    co2abs = instruct_row.find('co2abs').tag
    instruct_head.append(co2abs)      
    ivolt = instruct_row.find('ivolt').tag
    instruct_head.append(ivolt)      
    raw = instruct_row.find('raw').tag
    instruct_head.append(raw)

    csv_writer.writerow(instruct_head)                                        # Write header      
    isHeader = False

  celltemp = instruct_row.find('celltemp').text                               # Buffering data
  instruct.append(celltemp)    
  cellpres = instruct_row.find('cellpres').text
  instruct.append(cellpres)    
  co2 = instruct_row.find('co2').text
  instruct.append(co2)    
  co2abs = instruct_row.find('co2abs').text
  instruct.append(co2abs)      
  ivolt = instruct_row.find('ivolt').text
  instruct.append(ivolt)      
  raw = instruct_row.find('raw').text
  instruct.append(raw)

  csv_writer.writerow(instruct)                                             # Write data'''

csv_writer.close()
fp_csv.close()
os.system('clear')

print('Job done. \nSaved at : ./' + file_xml + '\nAnd at ./' + file_csv + '\n')

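【Comments】:

  • Is the first snippet your XML input? It is not well-formed XML, because it lacks a root element containing all the other elements, so I think that is what causes the error message.
  • Each line is a complete tree with its own root.
  • As @MartinHonnen mentioned, to be well-formed your XML needs to wrap all the lines in a root element, not just each individual line: <root>...content...</root>. Every <li820> then becomes a child of root.
  • It did indeed work after changing my XML structure, thanks everyone.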
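
The fix suggested in these comments can be sketched as follows. This is a minimal example, assuming the log holds one complete <li820> element per line as in the question; the helper name `parse_wrapped` and the wrapper tag `root` are illustrative choices, not part of the original code, and the sample lines are shortened versions of the logged ones.

```python
import xml.etree.ElementTree as et

# Two lines shortened from the log in the question: each is a complete
# <li820> document, so together they contain two top-level elements.
LOG = (
    "<li820><data><celltemp>5.1120729e1</celltemp><co2>7.7808494e2</co2></data></li820>\n"
    "<li820><data><celltemp>5.1120729e1</celltemp><co2>7.7823929e2</co2></data></li820>\n"
)


def parse_wrapped(text):
    """Wrap all lines in a single <root> element (hypothetical name) and parse."""
    return et.fromstring("<root>" + text + "</root>")


# Parsing the raw log fails: a second top-level element follows the first.
try:
    et.fromstring(LOG)
    raw_parse_failed = False
except et.ParseError:
    raw_parse_failed = True

root = parse_wrapped(LOG)
# Each <li820> is now a child of <root>; the readings sit one level down, in <data>.
co2_values = [row.find("data/co2").text for row in root.findall("li820")]
print(raw_parse_failed, co2_values)
```

Note that the values are nested inside <data>, so a lookup made on an <li820> element needs the `data/` prefix in its path; `find('co2')` called directly on <li820> returns None.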

Tags: python xml csv


【Answer 1】:

You should open the input file for reading ('r'), not for writing ('w'). Otherwise the file is emptied when your code runs.

fp_xml   = open(file_xml, 'r')
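
To illustrate the difference between the two modes, here is a small sketch using a temporary file (the file name and contents are made up for the demonstration): opening an existing file with 'w' truncates it immediately, while 'r' leaves it intact.

```python
import os
import tempfile

# Create a throwaway file with some content (a stand-in for the XML log).
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.xml")
with open(demo_path, "w") as fp:
    fp.write("<li820><data/></li820>\n")

# Mode 'r' only reads; the content is still there afterwards.
with open(demo_path, "r") as fp:
    content_after_read = fp.read()

# Mode 'w' truncates the file as soon as it is opened, even if nothing is written.
with open(demo_path, "w") as fp:
    pass
size_after_write_open = os.path.getsize(demo_path)

print(repr(content_after_read), size_after_write_open)
```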

Also, I have a better way to get all the elements: you do not need to know the tag names in advance.

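# `root` and `csv_writer` are the objects created in the question's code
header = []
isHeader = True
for instruct_row in root:                # XML to CSV buffer; iterate over child elements
    instruct = []

    for item in instruct_row:            # getchildren() was removed in Python 3.9; iterate directly
        if isHeader:
            header.append(item.tag)      # collect tag names from the first row only
        instruct.append(item.text)
    if isHeader:
        csv_writer.writerow(header)      # write the header once, before the first data row
        isHeader = False
    csv_writer.writerow(instruct)        # write data
fp_csv.close()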

My input XML is as follows:

<li820>
<data><celltemp>5.1120729e1</celltemp><cellpres>9.7705745e1</cellpres><co2>7.7808494e2</co2><co2abs>5.0983281e-2</co2abs><ivolt>1.1380004e1</ivolt><raw>2726238,1977386</raw></data>
<data><celltemp>5.1120729e1</celltemp><cellpres>9.7705745e1</cellpres><co2>7.7808494e2</co2><co2abs>5.0983281e-2</co2abs><ivolt>1.1380004e1</ivolt><raw>2726238,1977386</raw></data>
</li820>

Finally, you can see the data in your CSV file.
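
An alternative to restructuring the file, sketched here under the assumption that every logged line is one complete <li820> document (as in the question's log): parse each line on its own and write the rows as you go. The function name `log_lines_to_csv` and the shortened sample lines are illustrative, and an in-memory buffer stands in for the real CSV file.

```python
import csv
import io
import xml.etree.ElementTree as et


def log_lines_to_csv(lines, out):
    """Parse each non-empty line as its own XML document and write CSV rows to `out`."""
    writer = csv.writer(out, lineterminator="\n")
    header_written = False
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        data = et.fromstring(line).find("data")
        if data is None:
            continue                      # no <data> element on this line
        if not header_written:
            writer.writerow([child.tag for child in data])
            header_written = True
        writer.writerow([child.text for child in data])


# Sample lines shortened from the log in the question.
sample = [
    "<li820><data><celltemp>5.1120729e1</celltemp><co2>7.7808494e2</co2><raw>2726238,1977386</raw></data></li820>\n",
    "<li820><data><celltemp>5.1120729e1</celltemp><co2>7.7823929e2</co2><raw>2725850,1976922</raw></data></li820>\n",
]
buf = io.StringIO()
log_lines_to_csv(sample, buf)
csv_text = buf.getvalue()
print(csv_text)
```

With a real log, passing the open file object as `lines` should work the same way, since iterating over a text file yields one line at a time. A malformed line would still raise `ParseError`; whether to skip or abort in that case is left open here.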

【Comments】:

  • Great, I tried it at work this morning and it works well.
  • @LaurentFournier Glad to help. Please accept my answer.