【Posted】: 2013-02-04 08:54:03
【Question】:
This question is specifically about BeautifulSoup4, which sets it apart from these earlier questions:
Why is BeautifulSoup modifying my self-closing elements?
selfClosingTags in BeautifulSoup
Now that BeautifulStoneSoup (the former XML parser) is gone, how can I get bs4 to respect a new self-closing tag? For example:
import bs4
S = '''<foo> <bar a="3"/> </foo>'''
soup = bs4.BeautifulSoup(S, selfClosingTags=['bar'])
print soup.prettify()
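This does not self-close the bar tag; it emits a warning instead. What is the tree builder that bs4 refers to, and how do I get it to self-close a tag?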
/usr/local/lib/python2.7/dist-packages/bs4/__init__.py:112: UserWarning: BS4 does not respect the selfClosingTags argument to the BeautifulSoup constructor. The tree builder is responsible for understanding self-closing tags.
"BS4 does not respect the selfClosingTags argument to the "
<html>
 <body>
  <foo>
   <bar a="3">
   </bar>
  </foo>
 </body>
</html>
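For context, the "tree builder" in the warning is the parser backend that bs4 selects through the second constructor argument (`features`). The `"xml"` builder treats empty tags as self-closing, but it requires lxml to be installed. The following is only a minimal sketch that avoids that dependency: it assumes the stdlib-backed `"html.parser"` builder and sets the per-tag `can_be_empty_element` attribute by hand. It is one possible workaround, not necessarily the recommended one.

```python
import bs4

S = '''<foo> <bar a="3"/> </foo>'''

# The second argument selects the tree builder; "html.parser" ships with Python.
soup = bs4.BeautifulSoup(S, "html.parser")

# This builder only knows HTML's own void elements (br, img, ...),
# so flag each <bar> as allowed to be rendered in empty-element form.
for tag in soup.find_all("bar"):
    tag.can_be_empty_element = True

result = str(soup)
print(result)
```

A tag flagged this way is only rendered as `<bar .../>` while it has no children; once it has contents, it is written with a separate closing tag again.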
【Discussion】:
Tags: python xml xml-parsing beautifulsoup