【Question title】: Regex contains words if no negative words are before it [duplicate]
【Posted】: 2020-12-27 00:52:12
【Question】:

I want to match phrases that contain good or great, but only when the word is not negated by a preceding not or isn't.

sents = ["good words",                  # Words after phrase
         "not good words",
         "isn't good words",

         "great words",
         "not great words",
         "isn't great words",

         "words good",                  # Words before phrase
         "words not good",
         "words isn't good",

         "words great",
         "words not great",
         "words isn't great",

         "words good words",            # Words before and after phrase
         "words not good words",
         "words isn't good words",

         "words great words",
         "words not great words",
         "words isn't great words",
]

I want to get back:

good words
words good
words good words

great words
words great
words great words

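What regex would let me do this? Ideally, I would like to have a list of words that are matched only when the string does not contain any word from a list of negations.

【Comments】:

    Tags: python regex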


    【Solution 1】:

    You can use this regex in Python with two negative lookbehind assertions:

    (?<!isn't )(?<!not )\b(?:good|great)\b
    


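    Regex details:

    • (?<!isn't ): negative lookbehind; the match fails if the current position is immediately preceded by isn't followed by a space
    • (?<!not ): negative lookbehind; the match fails if the current position is immediately preceded by not followed by a space
    • \b: word boundary
    • (?:good|great): matches good or great
    • \b: word boundary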
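To make the individual pieces concrete, here is a small sketch that runs the pattern on a few strings. The inputs `goodness` and `not very good` are not from the question; they are assumptions added only to illustrate what `\b` and the lookbehinds do and do not cover.

```python
import re

# Pattern from the answer above: two negative lookbehinds, then a whole-word match.
pattern = re.compile(r"(?<!isn't )(?<!not )\b(?:good|great)\b")

# Illustrative inputs; "goodness" and "not very good" are assumptions for this sketch.
samples = [
    "good words",
    "not good words",
    "words isn't great",
    "goodness",
    "not very good",
]

# Map each sample to whether the pattern finds a match anywhere in it.
results = {s: bool(pattern.search(s)) for s in samples}
for s, matched in results.items():
    print(f"{s!r}: {matched}")
```

The last sample still matches: the lookbehinds only inspect the characters directly in front of good or great, so a negation separated by another word is not detected.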

    Code:

    >>> sents= ["good words",                   # Words after phrase
    ...         "not good words",
    ...         "isn't good words",
    ...         "great words",
    ...         "not great words",
    ...         "isn't great words",
    ...         "words good",                   # Words before phrase
    ...         "words not good",
    ...         "words isn't good",
    ...         "words great",
    ...         "words not great",
    ...         "words isn't great",
    ...         "words good words",             # Words before and after phrase
    ...         "words not good words",
    ...         "words isn't good words",
    ...         "words great words",
    ...         "words not great words",
    ...         "words isn't great words",
    ... ]
    >>> import re
    >>> reg = re.compile(r"(?<!isn't )(?<!not )\b(?:good|great)\b")
    >>> for s in sents:
    ...     if reg.search(s):
    ...         print(s)
    ...
    good words
    great words
    words good
    words great
    words good words
    words great words
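The question also asks about working from word lists. One possible way to generalize the pattern is sketched below. The helper name `build_pattern` is hypothetical, and the extra `\b` inside each lookbehind is an addition not present in the answer; it is there so that a word merely ending in a negation (for example `knot`) is not treated as one. The sketch assumes every listed word starts and ends with a word character.

```python
import re

def build_pattern(positive_words, negative_words):
    """Compile a regex matching a positive word not directly preceded by a negation.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # One negative lookbehind per negation word. Python's re module requires
    # each lookbehind to have a fixed width, so words of different lengths
    # cannot share a single lookbehind alternation.
    lookbehinds = "".join(
        rf"(?<!\b{re.escape(word)} )" for word in negative_words
    )
    # re.escape guards against regex metacharacters in the word lists.
    alternation = "|".join(re.escape(word) for word in positive_words)
    return re.compile(rf"{lookbehinds}\b(?:{alternation})\b")

pattern = build_pattern(["good", "great"], ["not", "isn't"])

sents = [
    "good words",
    "not good words",
    "words isn't great",
    "words great words",
]
kept = [s for s in sents if pattern.search(s)]
print(kept)
```

Emitting a separate lookbehind for each negation keeps the pattern valid for any mix of word lengths, at the cost of a longer regex when the negation list grows.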
    

    【Comments】:

      【Solution 2】:

      You need a lookbehind, in this case a negative one (a positive variant also exists). You can use it simply like this:

      (?<!not\s)great
      

      In this example, the word not must not appear immediately before great.

      Here is what the full pattern looks like:

      (?<!not\s)(?<!isn't\s)(great|good)
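The two patterns in this thread are not interchangeable. The sketch below compares them on a few inputs; the tab-separated and `goodness` samples are assumptions added for illustration and do not come from the question.

```python
import re

# Pattern with a literal space in the lookbehinds and \b word boundaries.
with_boundaries = re.compile(r"(?<!isn't )(?<!not )\b(?:good|great)\b")
# Pattern with \s in the lookbehinds and no word boundaries.
without_boundaries = re.compile(r"(?<!not\s)(?<!isn't\s)(great|good)")

# Illustrative inputs; "not\tgreat" and "goodness" are assumptions for this sketch.
samples = ["not great", "not\tgreat", "goodness", "great words"]

# For each sample: (matched by first pattern, matched by second pattern).
comparison = {
    s: (bool(with_boundaries.search(s)), bool(without_boundaries.search(s)))
    for s in samples
}
for s, flags in comparison.items():
    print(f"{s!r}: {flags}")
```

Because `\s` matches any whitespace character, the second pattern also rejects a negation followed by a tab, but without `\b` it accepts good inside longer words such as goodness.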
      


      【Comments】:
