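【Posted】: 2021-12-02 18:35:28
【Question】:
Hello, friends from Stack Overflow,
I want to convert a CSV file to XML with Python, and I heard that pandas can handle this task very easily.
It turns out it is not that easy.
What my code looks like: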
import pandas as pd
import chardet
from pandas.core.frame import DataFrame

csvFile = '172431-82056.csv'
xmlFile = 'mySecondData.xml'

def check_encoding(filename):
    """
    input: filename = "filename.csv"
    output: Dictionary = {'encoding': 'UTF-16', 'confidence': 1.0, 'language': ''}
    """
    result = {}
    # Sniff the encoding from the first 10000 bytes of the file
    with open(filename, 'rb') as rawdata:
        result = chardet.detect(rawdata.read(10000))
    return result

def import_csv(filename):
    """
    input: filename = "filename.csv"
    output: DataFrame with one column per CSV column
    """
    encoding = check_encoding(filename)['encoding']
    # sep=None lets the python engine detect the delimiter
    csv_data = pd.read_csv(filename, engine='python', encoding=encoding, sep=None)
    #print(csv_data)
    return csv_data

#print(import_csv(csvFile))

def convert_to_xml(input_file, output_file):
    csv_data = import_csv(input_file)
    csv_data.to_xml(path_or_buffer=output_file, index=True, root_name='products', row_name='item', elem_cols=['post_title', 'regular_price'], prefix='g:', pretty_print=True)

convert_to_xml(csvFile, xmlFile)
What my output looks like:
Traceback (most recent call last):
  File "c:\Users\PavelH\Documents\Git\CSV Converter\csv_converter.py", line 53, in <module>
    convert_to_xml(csvFile, xmlFile)
  File "c:\Users\PavelH\Documents\Git\CSV Converter\csv_converter.py", line 51, in convert_to_xml
    df.to_xml(path_or_buffer=output_file, index = True, root_name='products',row_name='item', prefix = 'g:', pretty_print=True)
  File "C:\Users\PavelH\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py", line 2986, in to_xml
    return xml_formatter.write_output()
  File "C:\Users\PavelH\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\formats\xml.py", line 265, in write_output
    xml_doc = self.build_tree()
  File "C:\Users\PavelH\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\formats\xml.py", line 485, in build_tree
    self.build_elems()
  File "C:\Users\PavelH\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\formats\xml.py", line 575, in build_elems
    SubElement(self.elem_row, elem_name).text = val
  File "src\lxml\etree.pyx", line 3136, in lxml.etree.SubElement
  File "src\lxml\apihelpers.pxi", line 179, in lxml.etree._makeSubElement
  File "src\lxml\apihelpers.pxi", line 1734, in lxml.etree._tagValidOrRaise
ValueError: Invalid tag name 'foo bar'
Are tags that contain spaces invalid?
【Comments】:
-
What does the CSV look like? What should the XML look like? Please add code snippets to the post.
-
Yes, XML node names cannot contain spaces.
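Following on from that comment, one possible workaround is to rename the offending columns before exporting. The snippet below is only a minimal sketch, not a pandas feature: `to_xml_tag` is a hypothetical helper name, and it deliberately uses a conservative ASCII-only rule (XML itself permits many more characters in names). Distinct labels can also collapse to the same tag (for example 'foo bar' and 'foo_bar'), so duplicates would need a separate check.

```python
import re

def to_xml_tag(name):
    """Turn an arbitrary column label into a conservative XML element name.

    Hypothetical helper for illustration; the ASCII-only rule is an assumption.
    """
    # Replace every run of characters outside [A-Za-z0-9_.-] with one underscore.
    tag = re.sub(r"[^A-Za-z0-9_.-]+", "_", str(name).strip())
    # An element name may not be empty or start with a digit, '.' or '-',
    # so prepend an underscore in those cases.
    if not re.match(r"[A-Za-z_]", tag):
        tag = "_" + tag
    return tag

# Example column labels, including the one from the traceback.
columns = ["foo bar", "post_title", "2nd price"]
renamed = [to_xml_tag(c) for c in columns]
print(renamed)  # ['foo_bar', 'post_title', '_2nd_price']
```

With pandas, the same function could presumably be applied just before the export, for instance `csv_data = csv_data.rename(columns=to_xml_tag)` ahead of the `to_xml` call.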
Tags: python pandas xml dataframe csv