【Posted】: 2015-05-26 19:03:58
【Question】:
As in my previous post, I am trying to parse an XML file. The file is:
<?xml version="1.0" ?>
<library>
    <book>
        <title>Sandman Volume 1: Preludes and Nocturnes</title>
        <author>Neil Gaiman</author>
    </book>
    <book>
        <title>Good Omens</title>
        <author>Neil Gamain</author>
        <author>Terry Pratchett</author>
    </book>
    <book>
        <title>All the Lovely Things</title>
        <author>James Daniel Wise</author>
    </book>
    <book>
        <title>Beginning Python</title>
        <author>Peter Norton, et al</author>
    </book>
</library>
My Python script is:
from xml.dom.minidom import parse
import xml.dom.minidom
import csv

def writeToCSV(myLibrary):
    # Python 2.7: the csv module expects the file to be opened in binary mode
    with open('csvout.csv', 'wb') as csvfile:
        writer = csv.writer(csvfile, delimiter = ',')
        writer.writerow(['title', 'author', 'author'])
        books = myLibrary.getElementsByTagName("book")
        for book in books:
            titleValue = book.getElementsByTagName("title")[0].childNodes[0].data
            # A book may have more than one author
            authors = []
            for author in book.getElementsByTagName("author"):
                authors.append(author.childNodes[0].data)
            # One row per book: the title followed by all of its authors
            writer.writerow([titleValue] + authors)

doc = parse('library.xml')
myLibrary = doc.getElementsByTagName("library")[0]

# Call main function
writeToCSV(myLibrary)
This gives me the following output:
title,author,author
Sandman Volume 1: Preludes and Nocturnes,Neil Gaiman
Good Omens,Neil Gamain,Terry Pratchett
All the Lovely Things,James Daniel Wise
Beginning Python,"Peter Norton, et al"
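First, I would like to know why it puts quotation marks around "Peter Norton, et al" in the last line, and how to get rid of them. Putting QUOTE_NONE in my code prevents this line from being returned.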
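The quoting behaviour can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 and an in-memory buffer (io.StringIO) in place of csvout.csv; the helper name render_row is hypothetical and not part of the script. With the default csv.QUOTE_MINIMAL, the writer quotes any field that contains the delimiter, so that the row can be read back into the same columns. With csv.QUOTE_NONE and no escapechar, the writer raises csv.Error for such a field, which is presumably why that row never reaches the file.

```python
import csv
import io


def render_row(row, **writer_options):
    """Write one row to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, **writer_options)
    writer.writerow(row)
    return buffer.getvalue()


row = ["Beginning Python", "Peter Norton, et al"]

# Default (QUOTE_MINIMAL): the field containing a comma is quoted.
default_text = render_row(row)

# QUOTE_NONE without an escapechar: the writer refuses the field.
try:
    render_row(row, quoting=csv.QUOTE_NONE)
    quote_none_error = None
except csv.Error as exc:
    quote_none_error = exc

# QUOTE_NONE with an escapechar: the comma is escaped instead of quoted.
escaped_text = render_row(row, quoting=csv.QUOTE_NONE, escapechar="\\")
```

Note that an unquoted, unescaped comma inside a field would make the row parse as one extra column, so a different delimiter (for example a tab) may be the simpler choice if the quotes are unwanted.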
Also, I would like to add another column header, "key". I would like it to be filled with sequential numbers, to give me this output:
key,title,author,author
1,Sandman Volume 1: Preludes and Nocturnes,Neil Gaiman
2,Good Omens,Neil Gamain,Terry Pratchett
3,All the Lovely Things,James Daniel Wise
4,Beginning Python,Peter Norton, et al
I have tried various things, such as setting a "key" variable = 0 and then doing key=+1 inside the loop in my function, but it doesn't work.
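One detail about that attempt: key=+1 is parsed as key = +1, so it assigns positive 1 on every pass, whereas key += 1 increments. The following is a minimal sketch of the numbering, assuming Python 3, with two rows hard-coded in place of the parsed XML; add_keys is a hypothetical helper. It uses enumerate(..., start=1) rather than a manual counter.

```python
import csv
import io


def add_keys(rows):
    """Prefix each row with a sequence number starting at 1."""
    return [[key] + row for key, row in enumerate(rows, start=1)]


# Hard-coded stand-in for the rows built from library.xml.
rows = [
    ["Sandman Volume 1: Preludes and Nocturnes", "Neil Gaiman"],
    ["Good Omens", "Neil Gamain", "Terry Pratchett"],
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["key", "title", "author", "author"])
writer.writerows(add_keys(rows))
keyed_text = buffer.getvalue()

# key=+1 is parsed as key = +1, so the value never moves past 1.
key = 0
for _ in rows:
    key = +1
stuck_key = key
```

In the original script the same idea would amount to looping with for key, book in enumerate(books, start=1) and writing [key, titleValue] + authors.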
【Comments】:
Tags: python xml python-2.7 csv