Python 3 - Using BeautifulSoup to find text in a webpage

【Question title】: Python 3 - Using BeautifulSoup to find text in a webpage
【Posted】: 2015-07-31 23:56:13
【Question】:

I have this code:

import requests
from bs4 import BeautifulSoup


url = "http://www.rockefeller.edu/research/areas/summary.php?id=1"
r = requests.get(url)
soup = BeautifulSoup(r.content)
a = 'Comments'
for x in (soup.find_all('p')):
    if a in x:
        print (x)
    else:
        print ('it is not there')

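Basically, I have a word in mind and I want to know where it occurs in the page. Say my word is "Comments". I want to find where the word Comments appears and print the tag that contains it (for example: <a href=#>Comments</a>).

Updated code (does not work for me):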

import requests
from bs4 import BeautifulSoup
import re


url = "http://www.rockefeller.edu/research/areas/summary.php?id=1"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
for x in (soup.find_all(string=re.compile('comment', flags=re.I))):
    print(x.parent)
    print(x.parent.name)

【Comments】:

Tags: python python-3.x beautifulsoup python-requests

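【Solution 1】:

Pass a compiled regular expression object as the string keyword argument. find_all then returns the string objects that contain the text, and you can use the parent attribute of each one to reach the tag that contains it: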

import re

...

for x in soup.find_all(string=re.compile('comment', flags=re.I)):
    print(x.parent)
    print(x.parent.name)
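
To try this approach without a network request, here is a minimal sketch that runs the same find_all(string=...) search against a small inline document. The SAMPLE_HTML content and the helper name find_parents_of_text are made up for illustration; they are not taken from the Rockefeller page.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page (an assumption for the demo).
SAMPLE_HTML = """
<html><body>
  <p>Nothing relevant here.</p>
  <p>Please leave a <b>comment</b> below.</p>
  <a href="/about/comments">Comments</a>
</body></html>
"""


def find_parents_of_text(html, word):
    """Return the tags that directly contain a string matching `word`, ignoring case."""
    soup = BeautifulSoup(html, "html.parser")
    # re.escape keeps characters such as "." or "?" in `word` literal.
    pattern = re.compile(re.escape(word), flags=re.I)
    # find_all(string=...) yields the matching strings; .parent is the enclosing tag.
    return [s.parent for s in soup.find_all(string=pattern)]


matches = find_parents_of_text(SAMPLE_HTML, "comment")
for tag in matches:
    print(tag.name, tag)
```

Each result is the innermost tag around the matching text, so the <b> element is reported rather than the <p> that surrounds it.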

【Discussion】:

I tried your code, but it still does not print anything. Can you help me with this?
Am I missing something? When I run it, nothing is printed. import requests from bs4 import BeautifulSoup import re url = "rockefeller.edu/research/areas/summary.php?id=1" r = requests.get(url) soup = BeautifulSoup(r.content, 'html.parser') for x in (soup.find_all(string=re.compile ('comment', flags=re.I))): print(x.parent) print(x.parent.name)
@bob, code in comments is hard to read because indentation is not preserved and some words are interpreted as markup.
@bob, your code prints <a href="/about/comments">Comments</a> and a for me.
@bob, which version of BeautifulSoup4 are you using? FYI, I am using 4.4.0.
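
The version question matters because the string argument was introduced in Beautiful Soup 4.4.0; earlier releases call the same filter text. The thread does not confirm that this was the cause of the empty output, so treat the following as a sketch of one way to check. The helper names version_tuple and text_filter_keyword are hypothetical.

```python
import re

import bs4
from bs4 import BeautifulSoup


def version_tuple(version):
    """Turn a version string such as '4.4.0' into a tuple of ints, e.g. (4, 4, 0)."""
    parts = []
    for piece in version.split("."):
        leading_digits = re.match(r"\d+", piece)
        if leading_digits is None:
            break  # stop at the first piece that does not start with a number
        parts.append(int(leading_digits.group()))
    return tuple(parts)


def text_filter_keyword(version):
    """Pick the find_all keyword for text filters: 'string' from 4.4.0 on, else 'text'."""
    return "string" if version_tuple(version) >= (4, 4) else "text"


print("bs4 version:", bs4.__version__)

keyword = text_filter_keyword(bs4.__version__)
soup = BeautifulSoup('<a href="/about/comments">Comments</a>', "html.parser")
# Pass the filter under whichever keyword the installed version understands.
found = soup.find_all(**{keyword: re.compile("comment", flags=re.I)})
print([s.parent.name for s in found])
```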

【Solution 2】:

I found the answer; here it is:

for x in (soup.find_all(True,text=re.compile(r'comment', re.I))):
    print(x)
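
Passing True as the first argument changes what comes back: instead of the matching strings, find_all returns the tags whose .string matches the pattern. The sketch below contrasts the two calls on a small made-up document. It uses string=, the newer name for the text filter, and the html variable is an assumption for the demo.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html = (
    '<p>Please leave a <b>comment</b> below.</p>'
    '<a href="/about/comments">Comments</a>'
)
soup = BeautifulSoup(html, "html.parser")
pattern = re.compile(r"comment", re.I)

# Without a name filter: the matching strings themselves are returned.
strings = soup.find_all(string=pattern)

# With True as the name filter: tags whose .string matches are returned.
tags = soup.find_all(True, string=pattern)

print([type(s).__name__ for s in strings])
print([t.name for t in tags])
```

The <p> element is not returned by the second call because it has several children, so its .string is None and there is nothing for the pattern to match.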
