Answers to: Check that both `lookbehind` conditions are satisfied in `RegEx`

【Question title】: Check that both `lookbehind` conditions are satisfied in `RegEx`
【Posted】: 2020-12-07 08:17:05
【Question description】:

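I am trying to match usernames that are not preceded by RT @ or RT@, by combining two lookbehind conditions as described in this tutorial. The regex and a sample input are shown in Example 1:

Example 1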

import re

text = 'RT @u1, @u2, u3, @u4, rt @u5:, @u3.@u1^, rt@u3'

mt_regex = r'(?i)(?<!RT )&(?<!RT)@(\w+)'

mt_pat = re.compile(mt_regex)

re.findall(mt_pat, text)

This outputs [] (an empty list), whereas the expected output is:

['u2', 'u4', 'u3', 'u1']

What am I missing? Thanks in advance.

【Question discussion】:

Tags: python-3.x regex python-re

【Solution 1】:

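The other answer is clearly correct and rightly accepted, but I thought this alternative might be useful to you, since it needs no negative lookbehinds. The benefit is that, thanks to \s*, you are not limited to a single space character: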

(?i)(?:^|[,.])\s*@(\w+)

See the online demo

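(?i) - case-insensitive matching. Note that you could also use re.IGNORECASE.
(?:^|[,.]) - a non-capturing group that matches either the start of the string or a literal comma or dot.
\s* - zero or more whitespace characters.
@ - matches "@" literally.
(\w+) - a capturing group that matches one or more word characters; \w is shorthand for [A-Za-z0-9_] (for str patterns in Python 3 it also matches Unicode word characters unless re.ASCII is set).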

This prints ['u2', 'u4', 'u3', 'u1']
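
For completeness, here is a minimal runnable sketch of the pattern above, applied to the sample text from the question. It passes re.IGNORECASE instead of the inline (?i) flag, as mentioned in the breakdown; the names mention_pat and mentions are only illustrative.

```python
import re

text = "RT @u1, @u2, u3, @u4, rt @u5:, @u3.@u1^, rt@u3"

# A mention is accepted only when it starts the string or follows a comma
# or a dot, optionally separated by whitespace.
mention_pat = re.compile(r"(?:^|[,.])\s*@(\w+)", re.IGNORECASE)

mentions = mention_pat.findall(text)
print(mentions)
# ['u2', 'u4', 'u3', 'u1']
```

Keep in mind that this is an allow-list approach: a mention preceded by anything other than the start of the string, a comma, or a dot (for example "hi @u9") is skipped, which may or may not be what you want.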

【Discussion】:

This is a really nice solution!

【Solution 2】:

If we break down your regex:

r"(?i)(?<!RT )&(?<!RT)@(\w+)"
(?i)        match the remainder of the pattern, case insensitive match
(?<!RT )    negative lookbehind
            asserts that 'RT ' does not match
&           matches the character '&' literally
(?<!RT)     negative lookbehind 
            asserts that 'RT' does not match
@           matches the character '@' literally
(\w+)       Capturing Group    
            matches [a-zA-Z0-9_] between one and unlimited times

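The literal & character is what prevents your regex from matching, because the text contains no "&" directly before an "@". Two lookbehinds placed side by side are already both checked at the same position, so no "and" operator is needed. Removing the & gives: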

import re

text = "RT @u1, @u2, u3, @u4, rt @u5:, @u3.@u1^, rt@u3"
mt_regex = r"(?i)(?<!RT )(?<!RT)@(\w+)"
mt_pat = re.compile(mt_regex)

print(re.findall(mt_pat, text))
# ['u2', 'u4', 'u3', 'u1']

See this regex here
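
To see how the two lookbehinds combine, here is a small sketch that runs the corrected pattern against a few isolated strings. The sample strings are made up for illustration and are not from the question.

```python
import re

# Two negative lookbehinds in a row are both tested at the same position,
# so the "@" is accepted only if neither "RT " nor "RT" comes right before it.
pat = re.compile(r"(?i)(?<!RT )(?<!RT)@(\w+)")

samples = ["RT @u1", "rt@u3", "rt  @u5", ", @u2", "@u4"]
results = {s: pat.findall(s) for s in samples}

for s, found in results.items():
    print(repr(s), found)
```

Note that "rt  @u5" (two spaces) still yields a match, because each lookbehind only inspects a fixed number of characters before the "@". That is the single-space limitation the other answer works around with \s*.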

【Discussion】: