How can I remove tags from a list scraped with BeautifulSoup? (Answers)

[Question title]: How can I remove tags from a list scraped with BeautifulSoup?
[Posted]: 2020-10-25 23:58:54
[Question description]:

I am extracting the "title" of pages scraped with BeautifulSoup by looking up the tag:

title = [text.find_all('h1', {'class': 'entry-title'}) for text in texts]

The output is a list like this:

[[<h1 class="entry-title">Receita de pão caseiro fácil para iniciantes</h1>],
 [<h1 class="entry-title">Pão branco com fermentação natural</h1>],... etc]

I want to remove <h1 class="entry-title"> and </h1> from the list.

How can I do that?

[Question comments]:

Tags: python html beautifulsoup tags

[Solution 1]:

You can use the extract() or decompose() functions to do that.
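For reference, here is a minimal sketch of what these two methods do, assuming bs4 is installed. The HTML snippet is made up for illustration (the <div> wrapper and the <p> are not from the question). Note that both methods remove the whole element from the tree, text included; extract() additionally returns the removed tag, so the text is still reachable through the return value.

```python
from bs4 import BeautifulSoup

# Illustrative markup only; the wrapper and paragraph are assumptions.
html = (
    '<div><h1 class="entry-title">Pão branco com fermentação natural</h1>'
    '<p>intro</p></div>'
)

# decompose() destroys the tag together with its contents.
soup = BeautifulSoup(html, "html.parser")
soup.find("h1", class_="entry-title").decompose()
after_decompose = str(soup)

# extract() also removes the tag from the tree, but hands it back.
soup = BeautifulSoup(html, "html.parser")
removed = soup.find("h1", class_="entry-title").extract()
after_extract = str(soup)
removed_text = removed.get_text()

print(after_decompose)
print(after_extract)
print(removed_text)
```

If the goal is only to keep the title text and drop the markup, calling get_text() on each tag is usually the more direct route.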

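[Comments]:

You should search and try it yourself first. If you run into a problem, post your code.
Thanks for your answer! I tried to do this with a for loop, as in the documentation, but it didn't work. Could you give some examples of how to use these functions?
Alejandro Gonzalez, I tried the functions Micheal Macaulay suggested, and I also tried RegEx, but it didn't work!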

[Solution 2]:

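title = []
for text in texts:
    # Collect the plain-text titles found on this page
    temp = []
    for texData in text.find_all('h1', attrs={'class': 'entry-title'}):
        # get_text() returns the tag's text content without the markup
        temp.append(texData.get_text())
    title.append(temp)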
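A self-contained sketch of the same idea, in case it helps to run it in isolation. The `pages` list and the <article> wrapper are hypothetical stand-ins for the asker's `texts`, which is assumed here to be a list of parsed BeautifulSoup objects; the two titles are the ones shown in the question. It also shows one way to flatten the result if a list of lists is not wanted.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the asker's data: one HTML string per page.
pages = [
    '<article><h1 class="entry-title">'
    'Receita de pão caseiro fácil para iniciantes</h1></article>',
    '<article><h1 class="entry-title">'
    'Pão branco com fermentação natural</h1></article>',
]
texts = [BeautifulSoup(page, "html.parser") for page in pages]

# Same nested shape as the loop above: one inner list per page.
nested_titles = [
    [h1.get_text(strip=True) for h1 in text.find_all("h1", class_="entry-title")]
    for text in texts
]

# Flattened variant: a single list of title strings.
flat_titles = [title for group in nested_titles for title in group]

print(nested_titles)
print(flat_titles)
```

Passing strip=True to get_text() trims surrounding whitespace, which tends to matter on real pages where the tag content spans several lines.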

[Comments]:

For long-term value, please add an explanation to your code. Code-only answers are frowned upon on SO.