Python re.findall() not working as expected - Finding string between anchors [duplicate]

【Question title】: Python re.findall() not working as expected - Finding string between anchors [duplicate]
【Posted】: 2019-11-18 13:32:16
【Question】:

I need to parse text that follows this pattern:

Lorem ipsum, baby shark, do do doo

    Host: MyHostName

Blah, Blah

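I am trying to isolate the line Host: MyHostName

On regex101 this regex works fine: (?<=Host:).*?(?=$) but for some reason Python's re.findall() keeps returning an empty list. I have tweaked it in several ways but cannot get it to work.

Is there something I am overlooking here?

(Note: I am using Python 3.6)

Edit: my code in context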

import re
pattern = r'(?<=Host:)(.*)(?=$)' 
data = """ 
        Lorem Ipsum...
          Host: MyHostName
        """

x = re.findall(pattern, data)

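【Comments】:

Please show the entire code you are using.
Added multiline in my answer.
You don't need (?=$); just use $, since it is zero-width and consumes no characters. And why use the non-greedy .*?, especially since you seem to want everything up to the end of the line?
@LogicalKip When I switched to $ it returned an empty list again.
You don't need $ in the first place. Use pattern = r'Host:\s*(.+)'

Tags: python regex python-3.x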

【Solution 1】:

import re

regex = r"(?<=Host:).*?(?=$)"

test_str = ("Lorem ipsum, baby shark, do do doo\n\n"
    "    Host: MyHostName\n\n"
    "Blah, Blah")

matches = re.findall(regex, test_str, re.MULTILINE)

print(matches)

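【Discussion】:

- This seems to work. I have several other regexes that work fine without MULTILINE. Why do I need it here?
In multiline mode, the pattern character '$' matches at the end of the string and also at the end of each line (just before each newline).
I will, I just wanted to understand what is going on here first.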
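To make the comment about '$' concrete, here is a minimal sketch comparing the question's pattern with and without re.MULTILINE. The sample text is shortened from the question, and the variable names (without_flag, with_flag) are only illustrative. Other regexes may work without the flag simply because they never rely on '$' matching in the middle of the text.

```python
import re

text = "Lorem ipsum\n    Host: MyHostName\n\nBlah, Blah"
pattern = r"(?<=Host:).*?(?=$)"

# Default mode: '$' matches only at the very end of the string
# (or just before a single trailing newline). '.' does not match
# a newline, so '.*?' can never reach that position from "Host:".
without_flag = re.findall(pattern, text)

# re.MULTILINE: '$' also matches just before every newline.
with_flag = re.findall(pattern, text, re.MULTILINE)

print(without_flag)  # []
print(with_flag)     # [' MyHostName']
```

Note that the match keeps the space after "Host:", because the lookbehind stops right after the colon.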

【Solution 2】:

I would keep it simple and just use the following regex pattern:

\bHost: \S+

Script:

import re

text = """Lorem ipsum, baby shark, do do doo

    Host: MyHostName

Blah, Blah"""

matches = re.findall(r'\bHost: \S+', text)
print(matches)

This prints:

['Host: MyHostName']

【Discussion】:

- This works, but I want to point out that it returns ['Host: MyHostName'] rather than ['MyHostName'].
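If only the host name is wanted, one possible variation (a sketch, not part of the original answer) is to wrap the value in a capture group. When a pattern contains exactly one group, re.findall() returns the group's text instead of the whole match. The variable name hosts is illustrative.

```python
import re

text = """Lorem ipsum, baby shark, do do doo

    Host: MyHostName

Blah, Blah"""

# The single capture group (\S+) makes findall() return only the
# text after "Host:" and any whitespace, not the full match.
hosts = re.findall(r"\bHost:\s*(\S+)", text)
print(hosts)  # ['MyHostName']
```

This needs neither '$' nor re.MULTILINE, since \S+ stops at the first whitespace character.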