Regular expression to detect text between double curly braces: answers

【Question title】: Regular expression to detect text in between double curly braces
【Posted】: 2014-01-05 17:19:36
【Question description】:

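Using a regular expression, I want to detect the text/string between an opening and a closing pair of double curly braces. It should capture any inner curly braces together with their text.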

For example:

{{detect this {{and this as well}} text}} but text does not ends here so it should {{not detect this}}.

I have written this regular expression:

\{\{[\s\S]+\}\}

But this selects the whole string, from {{detect this ... through to {{not detect this}}.

Note: I am using Python's re module for this.
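
To make the reported behaviour concrete, here is a minimal sketch using only the standard re module. It runs the pattern from the question against the sample text, and also tries a lazy variant (`+?`), which is an addition for illustration and not something the question proposes. The variable names are hypothetical.

```python
import re

sample = ("{{detect this {{and this as well}} text}} but text does not ends here "
          "so it should {{not detect this}}.")

# Greedy quantifier: [\s\S]+ grabs as much as it can, so the match
# runs from the first "{{" all the way to the LAST "}}" in the string.
greedy = re.search(r"\{\{[\s\S]+\}\}", sample).group(0)

# Lazy quantifier: [\s\S]+? stops at the FIRST "}}" it can reach,
# which cuts the outer block short instead of balancing the braces.
lazy = re.search(r"\{\{[\s\S]+?\}\}", sample).group(0)

print(greedy)
print(lazy)
```

Neither variant counts nesting depth: the greedy form overshoots and the lazy form undershoots, which is why the replies below point towards a parser.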

【Question comments】:

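You need to parse this, you cannot regex this...
@thefourtheye It is not a regular language, but I would not be surprised in the slightest if it could be recognized with re. Most things called "regular expressions" in modern programming are more powerful than finite automata. Whether that is advisable is another matter entirely.
If you have the regex module from Python 3.x, you can use this.
@delnan: No, it is not possible with Python's re, because it does not support recursion.
@HamZa I know, I just didn't want to use a negative lookahead and wanted to keep it simple ^^;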
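
As a rough illustration of the "parse it" suggestion in the comments above, the following sketch walks the string once and tracks nesting depth by hand, using only the standard library. The function name `find_balanced` and its parameters are hypothetical and not taken from any poster in this thread. It assumes delimiters are exactly `{{` and `}}` and makes no attempt to resolve ambiguous runs such as `{{{`.

```python
def find_balanced(text, opener="{{", closer="}}"):
    """Return the first balanced opener...closer span in text, or None."""
    start = text.find(opener)
    if start == -1:
        return None  # no opening delimiter at all
    depth = 0
    i = start
    while i < len(text):
        if text.startswith(opener, i):
            depth += 1              # entering a (possibly nested) block
            i += len(opener)
        elif text.startswith(closer, i):
            depth -= 1              # leaving a block
            i += len(closer)
            if depth == 0:
                return text[start:i]  # outermost block just closed
        else:
            i += 1
    return None  # ran out of text before the braces balanced


sample = ("{{detect this {{and this as well}} text}} but text does not ends here "
          "so it should {{not detect this}}.")
print(find_balanced(sample))
```

The depth counter is what a plain re pattern lacks: it plays the role of the stack that a recursive grammar would provide.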

Tags: python regex string

【Solution 1】:

Pyparsing lets you define recursive grammars, but it also has built-in helpers for common grammars like this one. See the annotated code sample below:

from pyparsing import nestedExpr, ungroup, originalTextFor

# use nestedExpr to define a default expression with left-right nesting markers
nestedText = ungroup(nestedExpr('{{','}}'))

sample = """{{detect this {{and this as well}} text}} but text does not ends here so it should {{not detect this}}."""

# note how reporting the results as a list keeps the nesting of {{ }}'s
print(nestedText.parseString(sample).asList())
# prints ['detect', 'this', ['and', 'this', 'as', 'well'], 'text']

# if you just want the string itself, wrap with 'originalTextFor'
print(originalTextFor(nestedText).parseString(sample)[0])
# prints {{detect this {{and this as well}} text}}

【Comments】:

【Solution 2】:

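First, {{[\s\S]+}} is (almost) the same as {{.+}}. The reason: \s covers all whitespace and \S covers everything that is not whitespace, so together they match any character. I usually avoid uppercase character classes inside [], because they mostly cause confusion.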

Second, I think I agree with thefourtheye: I cannot quickly think of a regular expression that solves your problem.

【Comments】:

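In many regex flavors, . does not match newlines unless a special flag is set (and that flag is not available in some languages), so [\s\S] and [^] are common workarounds.
Yes, but since the OP is talking about Python's re, which has a dotall flag, I think writing .+ and enabling dotall would be "cleaner" because it is more obvious.
True, most languages have a dotall flag and it is indeed the cleaner solution, but sometimes (depending on the use case or regex) you may want . not to match newlines, or you may want to keep the expression portable to another language. I was just commenting for future beginners who might visit the regex tag.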
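
The trade-off discussed in these comments can be checked with a short sketch against Python's standard re module. The multi-line test string and the variable names are made up for illustration.

```python
import re

text = "{{first line\nsecond line}}"

# Without a flag, "." does not match "\n", so the pattern cannot
# get from the opening braces to the closing ones.
no_flag = re.search(r"\{\{.+\}\}", text)

# With re.DOTALL, "." matches any character including newlines.
with_flag = re.search(r"\{\{.+\}\}", text, re.DOTALL)

# [\s\S] matches any character regardless of flags, which keeps the
# pattern portable to engines that lack a dotall option.
portable = re.search(r"\{\{[\s\S]+\}\}", text)

print(no_flag)
print(with_flag.group(0) == portable.group(0))
```

Both working forms are still greedy, so they only address the newline question and not the nesting problem from the original post.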