【Question title】: Python BS4 - NameError: name 'tagID' is not defined
【Posted】: 2023-03-11 21:22:01
【Question】:

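I'm new to Python. I'm building an application that parses and cleans HTML generated by MS Word. In the code below, I pass the content in as a BS4 object and try to update specific span tags with new attributes.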

content = """<html>
    <head></head>
    <body>
    <span style="background: #c0c0c0">Table 1</span>
    <span style="background: #c0c0c0">Figure 1</span>
    </body>
    </html>"""

from bs4 import Tag

def clean_table_figure_id_tags(content):
    for element in content.findAll('span', style='background: #c0c0c0'):
        # inspect the existing tag a to determine table or figure
        if 'Table' in element.string:
            tagID = 'TableId'
        elif 'Figure' in element.string:
            tagID = 'FigureId'
        # tagID = content(elementString)
        newTag = Tag(builder=content.builder, name='span', attrs={'id': tagID, 'class': 'variable'})
        newTag.string = element.string
        element.replace_with(newTag)
    return content

However, I get the following error: NameError: name 'tagID' is not defined. Any help is greatly appreciated.

【Comments】:

Tags: python-3.x beautifulsoup


【Solution 1】:

If I understand correctly, you're looking for something like this:

from bs4 import BeautifulSoup as bs
content = """[your html above]"""
soup = bs(content,'lxml')

for elem in soup.select('span[style="background: #c0c0c0"]'):
    if "Table" in elem.text:
        elem.attrs['id'] = 'TableId'
    if "Figure" in elem.text:
        elem.attrs['id'] = 'FigureId'
print(soup.prettify())

Output:

<html>
 <head>
 </head>
 <body>
  <span id="TableId" style="background: #c0c0c0">
   Table 1
  </span>
  <span id="FigureId" style="background: #c0c0c0">
   Figure 1
  </span>
 </body>
</html>
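
The answer above does not say why the original function failed. One likely cause (an assumption, since the real Word export is not shown): a matching span contained neither "Table" nor "Figure", so neither branch assigned tagID, and reading an unassigned name raises NameError (or its subclass UnboundLocalError inside a function). The sketch below keeps the asker's function name but adds a hypothetical helper, tag_id_for, that returns None in that case so such spans are skipped. Replacing the attributes with id and class 'variable' follows the question's code, not the answer's, and soup is assumed to be an already-parsed BeautifulSoup object.

```python
def tag_id_for(text):
    """Return the id for a span's text, or None if it names neither a table nor a figure."""
    if 'Table' in text:
        return 'TableId'
    if 'Figure' in text:
        return 'FigureId'
    return None  # explicit fallback, so the caller never reads an unassigned name


def clean_table_figure_id_tags(soup, style='background: #c0c0c0'):
    """Tag highlighted Table/Figure spans in place and return the same soup object."""
    for element in soup.find_all('span', style=style):
        # get_text() always returns a str, whereas .string can be None
        # when the span contains nested tags.
        tag_id = tag_id_for(element.get_text())
        if tag_id is None:
            continue  # unrelated highlighted span: leave it untouched
        # Replace the Word styling with the attributes the question wanted.
        element.attrs = {'id': tag_id, 'class': 'variable'}
    return soup
```

It could be called as clean_table_figure_id_tags(BeautifulSoup(content, 'html.parser')), after from bs4 import BeautifulSoup. Editing attrs in place avoids building a new Tag and calling replace_with, at the cost of discarding every other attribute on the span.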

【Comments】:
