How to iterate through elements in an HTML file: answers

【Question title】: How to iterate through elements in an HTML file
【Posted】: 2020-04-13 02:59:01
【Question description】:

This is the page I am looking at: https://www.nytimes.com/topic/destination/russia

I have imported BeautifulSoup and requests. I want to create a text file containing all of the headlines on this page. I can get one of them using

from bs4 import BeautifulSoup
import requests
source = requests.get('https://www.nytimes.com/topic/destination/russia').text
soup = BeautifulSoup(source, 'lxml')
headline = soup.find('h2').get_text()
print(headline)

which produces:

When an Oil Price War Meets Coronavirus Fears, Markets Get Punched in the Face

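So far so good. However, I have no idea how to iterate over the page and collect all of its headlines. Any help would be greatly appreciated.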

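【Question comments】:

Does this answer your question? Python beautifulsoup iterate over table
stackoverflow.com/search?q=Python+beautifulsoup+iterate
Can you clarify what exactly the problem is? It sounds like you just need to learn how to use BeautifulSoup.
Did my answer help you? If so, please don't forget to click the check mark next to my answer :)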

Tags: python html web-scraping

【Solution 1】:

Use find_all() to get all of the headings.

Then use a for loop to get the text of each heading and print it.

from bs4 import BeautifulSoup
import requests
source = requests.get('https://www.nytimes.com/topic/destination/russia').text
soup = BeautifulSoup(source, 'lxml')  # name the parser explicitly, as in the question, to avoid a warning
headings = soup.find_all('h2')
for h in headings:
    heading = h.get_text()
    print(heading)
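
The question also asked for the headlines to end up in a text file, which the loop above does not do. The following is a minimal sketch of one way to do that, not a definitive implementation. The function names `extract_headlines` and `save_headlines`, the output file name `headlines.txt`, and the `SAMPLE_HTML` markup are all made up for illustration; in practice the HTML would be the `source` string downloaded with requests. It uses the built-in `html.parser` so that lxml is not required.

```python
from bs4 import BeautifulSoup


def extract_headlines(html):
    """Return the stripped text of every <h2> element in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # get_text(strip=True) trims whitespace around each heading's text
    return [h.get_text(strip=True) for h in soup.find_all("h2")]


def save_headlines(headlines, path):
    """Write one headline per line to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as f:
        for line in headlines:
            f.write(line + "\n")


# Hypothetical sample markup standing in for the downloaded page source.
SAMPLE_HTML = """
<html><body>
  <h2> First headline </h2>
  <p>Some teaser text.</p>
  <h2>Second headline</h2>
</body></html>
"""

if __name__ == "__main__":
    headlines = extract_headlines(SAMPLE_HTML)
    save_headlines(headlines, "headlines.txt")  # hypothetical file name
```

Whether every `<h2>` on the real page is actually a headline depends on the page's markup, so the result may need filtering.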

【Comments】:

【Solution 2】:

Try:

for headline in soup.find_all('h2'):
    print(headline.get_text())

find_all returns all of the <h2> tags as a list. Then just iterate over it.
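
As a small illustration of that list-like behaviour, the sketch below runs on a made-up HTML string (an assumption, not the real page) and shows that the result of `find_all` supports `len()`, indexing, and list comprehensions. It also shows that `find_all` accepts a list of tag names, which may help if a page spreads its headlines over more than one heading level.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this demonstration.
html = "<h2>One</h2><p>skip</p><h2>Two</h2><h3>Three</h3>"
soup = BeautifulSoup(html, "html.parser")

tags = soup.find_all("h2")               # list-like collection of matching tags
count = len(tags)                        # number of <h2> tags found
first = tags[0].get_text()               # indexing works like a normal list
texts = [t.get_text() for t in tags]     # text of every <h2>

# Passing a list of names matches any of them, in document order.
both = [t.get_text() for t in soup.find_all(["h2", "h3"])]

print(count, first, texts, both)
```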

【Comments】: