【Posted】: 2016-08-28 14:22:21
【Question】:
This is my XML file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE papers>
<papers>
<paper>
<title>Title containing & and more</title>
</paper>
</papers>
How can I read it with lxml's etree? I tried:
from lxml import etree
with open(xml_file, 'r') as inf:
    tree = etree.parse(inf)
But it results in the following traceback:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "lxml.etree.pyx", line 3239, in lxml.etree.parse (src/lxml/lxml.etree.c:69955)
  File "parser.pxi", line 1769, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:102257)
  File "parser.pxi", line 1789, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:102516)
  File "parser.pxi", line 1684, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:101442)
  File "parser.pxi", line 1134, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:97069)
  File "parser.pxi", line 582, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:91275)
  File "parser.pxi", line 683, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:92461)
  File "parser.pxi", line 622, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:91757)
lxml.etree.XMLSyntaxError: xmlParseEntityRef: no name, line 5, column 30
【Comments】:
-
I ran your XML file through xmllint and got an error at the & character. That means your XML is not well-formed.
-
Escape it with &amp;
-
I can't change the file.
-
@MERose, ...the file is wrong. As written, it is not valid XML, and therefore not really an "XML file". File a bug with whatever software created it.
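-
If the file on disk really cannot be touched, the escaping suggested above can be applied to a copy in memory before it reaches lxml. The following is only a minimal sketch, not an official lxml recipe: `escape_bare_ampersands` and `parse_lenient` are hypothetical helper names, and it assumes the file contains no CDATA sections or comments with a literal `&` (those would be escaped too).

```python
import re

# Matches "&" that does NOT start an entity or character reference
# such as &amp;, &#38; or &#x26;.
_BARE_AMP = re.compile(
    rb"&(?!(?:[A-Za-z_:][A-Za-z0-9_.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(data: bytes) -> bytes:
    """Return data with every stray "&" replaced by "&amp;"."""
    return _BARE_AMP.sub(b"&amp;", data)


def parse_lenient(path):
    """Parse the file at path after escaping stray ampersands in memory."""
    # Imported here so the escaping helper above needs only the stdlib.
    from lxml import etree

    # Read bytes so that the parser applies the declared encoding itself.
    with open(path, "rb") as inf:
        raw = inf.read()
    return etree.fromstring(escape_bare_ampersands(raw)).getroottree()
```

With the file from the question, `parse_lenient(xml_file)` should return a tree whose `<title>` text keeps the ampersand. Another option is `etree.XMLParser(recover=True)`, but as far as I know it tends to drop the offending `&` from the text, so check the result before relying on it.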