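【Posted】: 2016-10-14 10:52:37
【Question】:
I am trying to extract data from a sensor (it communicates with "XML-like" strings) and convert it to CSV. With my current code I can already write the XML file, but each record ends up on a single line (from root to /root).
I don't know whether this is the cause, but I get the ElementTree parse error "junk after document element". Everything I have read so far puts the problem in the XML construction (multiple roots, no root, etc.), so I am a bit lost in my case.
The logged XML file: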
<li820><data><celltemp>5.1120729e1</celltemp><cellpres>9.7705745e1</cellpres><co2>7.7808494e2</co2><co2abs>5.0983281e-2</co2abs><ivolt>1.1380004e1</ivolt><raw>2726238,1977386</raw></data></li820>
<li820><data><celltemp>5.1120729e1</celltemp><cellpres>9.7684698e1</cellpres><co2>7.7823929e2</co2><co2abs>5.0991268e-2</co2abs><ivolt>1.1380004e1</ivolt><raw>2725850,1976922</raw></data></li820>
<li820><data><celltemp>5.1120729e1</celltemp><cellpres>9.7705745e1</cellpres><co2>7.7797288e2</co2><co2abs>5.0977463e-2</co2abs><ivolt>1.1373291e1</ivolt><raw>2726166,1977001</raw></data></li820>
Content of a single line (in tree view):
<li820>
  <data>
    <celltemp>1.9523970e1</celltemp>
    <cellpres>9.8993663e1</cellpres>
    <co2>3.5942180e4</co2>
    <co2abs>4.0364418e-1</co2abs>
    <ivolt>1.1802978e1</ivolt>
    <raw>2789123,1884335</raw>
  </data>
</li820>
The error:
Traceback (most recent call last):
  File "licor_read.py", line 96, in <module>
    tree = et.parse(file_xml) # Set XML Parser
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: junk after document element: line 2, column 0
My code:
import os, sys, subprocess
import time, datetime
import serial
import string
import glob
import csv
import xml.etree.ElementTree as et
from xml.etree.ElementTree import XMLParser, XML, fromstring, tostring
from os import path
from bs4 import BeautifulSoup as bs
#-------------------------------------------------------------
#------------------ Open configurations ----------------------
#-------------------------------------------------------------
############
# Settings #
############
DEBUG = True
LOG = True
FREQ = 1
PORT = '/dev/ttyUSB0'
BAUD = 9600
PARITY = 'N'
STOPBIT = 1
BYTE_SZ = 8
TIMEOUT = 5.0
log_dir = 'logs/'
out_dir = 'export/'
fname_xml = 'licor820-data-{}.xml'.format(datetime.datetime.now()) # DO NOT touch the {} brackets
fname_csv = 'licor820-data-{}.csv'.format(datetime.datetime.now()) #
isLooping = 20 # Nr of data extractions
isHeader = True # Do not touch if data headers are required
isBegin = False
#-------------------------------------------------------------
#----- Better know what you are doing from this point --------
#-------------------------------------------------------------
##################
# Initialisation #
##################
file_xml = os.path.join(log_dir, fname_xml) # Define path and file name
file_csv = os.path.join(out_dir, fname_csv) #
fp_xml = open(file_xml, 'w') # Open writing streams
fp_csv = open(file_csv, 'w') #
try:
    buff = serial.Serial(PORT, BAUD, BYTE_SZ, PARITY, STOPBIT, TIMEOUT) # Open Serial connection
except Exception as e:
    if DEBUG:
        print ("ERROR: {}".format(e))
    sys.exit("Could not connect to the Licor")
csv_writer = csv.writer(fp_csv) # Define CSV writer
instruct_head = [] # ''
################
# Main program #
################
while isLooping : # Define nr of refreshed data extracted
    #os.system('clear')
    print('RAW/XML in progress... ' + str(isLooping)) # Debug this loop
    if not isBegin : # Verify presence of the <licor> tag
        raw_output = buff.readline()
        while not raw_output.startswith('<l') : # Compare by value, one line at a time
            raw_output = buff.readline() # Skip the lines read until <licor>
        isBegin = True
    raw_output = buff.readline()
    xml_output = raw_output
    print(xml_output)
    fp_xml.write(xml_output) # Write from serial port to xml
    isLooping -= 1
fp_xml.close()
tree = et.parse(file_xml) # Set XML Parser
root = tree.getroot() # ''
for instruct_row in root.findall('li820'): # XML to CSV buffer
    instruct = []
    if isHeader is True: # Buffering header
        celltemp = instruct_row.find('celltemp').tag
        instruct_head.append(celltemp)
        cellpres = instruct_row.find('cellpres').tag
        instruct_head.append(cellpres)
        co2 = instruct_row.find('co2').tag
        instruct_head.append(co2)
        co2abs = instruct_row.find('co2abs').tag
        instruct_head.append(co2abs)
        ivolt = instruct_row.find('ivolt').tag
        instruct_head.append(ivolt)
        raw = instruct_row.find('raw').tag
        instruct_head.append(raw)
        csv_writer.writerow(instruct_head) # Write header
        isHeader = False
    celltemp = instruct_row.find('celltemp').text # Buffering data
    instruct.append(celltemp)
    cellpres = instruct_row.find('cellpres').text
    instruct.append(cellpres)
    co2 = instruct_row.find('co2').text
    instruct.append(co2)
    co2abs = instruct_row.find('co2abs').text
    instruct.append(co2abs)
    ivolt = instruct_row.find('ivolt').text
    instruct.append(ivolt)
    raw = instruct_row.find('raw').text
    instruct.append(raw)
    csv_writer.writerow(instruct) # Write data
fp_csv.close() # csv.writer objects have no close(); closing the file is enough
os.system('clear')
print('Job done. \nSaved at : ./' + file_xml + '\nAnd at ./' + file_csv + '\n')
【Comments】:
-
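Is that first snippet your XML input? It is not well-formed XML because it lacks a single root element containing all the other elements, so I think that is what causes the error message.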
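A minimal reproduction of this diagnosis, as a sketch written for Python 3. The two records are shortened from the log in the question, and `try_parse` is a hypothetical helper introduced only for illustration:

```python
import xml.etree.ElementTree as ET

# Two records, shortened from the log in the question (one per line).
log_text = (
    "<li820><data><co2>7.7808494e2</co2></data></li820>\n"
    "<li820><data><co2>7.7823929e2</co2></data></li820>\n"
)


def try_parse(text):
    """Return the parsed root element, or the ParseError message as a string."""
    try:
        return ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)


# A single record is a complete document on its own and parses fine.
single = try_parse(log_text.splitlines()[0])

# Two top-level elements in one document are rejected by the parser.
both = try_parse(log_text)
print(both)
```

The second call fails at the start of line 2, which is where the second `<li820>` begins, matching the position reported in the traceback.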
-
The root is <li820>; each line is a complete tree.
-
As @MartinHonnen mentioned, to be well-formed your XML needs *all* the lines wrapped in one root element, not just each individual line: <root>...content...</root>. Every <li820> then becomes a child of root.
-
It did indeed work after I changed my XML structure, thanks everyone.
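One way the suggested fix might look, as a sketch rather than the poster's actual solution. It is written for Python 3 and assumes the log contains only `<li820>` records with no XML declaration. `FIELDS` and `log_to_rows` are hypothetical names, and the sample lines are copied from the log in the question. Since the readings sit under `<data>`, the lookup path used here is `data/<field>`:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Column order taken from the child elements of <data> in the question.
FIELDS = ["celltemp", "cellpres", "co2", "co2abs", "ivolt", "raw"]


def log_to_rows(log_text):
    """Wrap the per-line records in one <root> element and return CSV rows."""
    root = ET.fromstring("<root>" + log_text + "</root>")
    rows = [FIELDS]  # header row
    for record in root.findall("li820"):
        # Each reading is a grandchild of <li820>, nested inside <data>.
        rows.append([record.findtext("data/" + name) for name in FIELDS])
    return rows


sample = (
    "<li820><data><celltemp>5.1120729e1</celltemp><cellpres>9.7705745e1</cellpres>"
    "<co2>7.7808494e2</co2><co2abs>5.0983281e-2</co2abs><ivolt>1.1380004e1</ivolt>"
    "<raw>2726238,1977386</raw></data></li820>\n"
    "<li820><data><celltemp>5.1120729e1</celltemp><cellpres>9.7684698e1</cellpres>"
    "<co2>7.7823929e2</co2><co2abs>5.0991268e-2</co2abs><ivolt>1.1380004e1</ivolt>"
    "<raw>2725850,1976922</raw></data></li820>\n"
)

rows = log_to_rows(sample)

# Write to an in-memory buffer here; a real script would open a file instead.
out = io.StringIO()
csv.writer(out).writerows(rows)
print(out.getvalue())
```

Wrapping the text in memory leaves the log file untouched. Writing `<root>` before the first record and `</root>` after the last one while logging would make the file itself well-formed, so `et.parse` could read it directly.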