How do you understand regular expressions that are written in one line? Answers

【Question title】: How do you understand regular expressions that are written in one line?
【Posted】: 2010-08-29 20:20:18
【Question description】:

This is a neat, well-documented regular expression that is easy to understand, maintain, and modify.

    text = text.replace(/
    (                               // Wrap whole match in $1
        (
            ^[ \t]*>[ \t]?          // '>' at the start of a line
            .+\n                    // rest of the first line
            (.+\n)*                 // subsequent consecutive lines
            \n*                     // blanks
        )+
    )
    /gm,

But how do you deal with ones like this?

    text = text.replace(/((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)/gm,

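Is there some kind of beautifier that can make sense of it and describe what it does?

【Comments】:

Perhaps some of the tools at stackoverflow.com/questions/32282/regex-testing-tools would help.
I do neither. I always use 6-10 lines of code with explode/join/strstr/substr (PHP) instead. That is easier to understand, maintain, and even write.
Not every language or library that supports regular expressions lets you write them as cleanly as your example, because not all of them allow the added whitespace and comments.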
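
To illustrate the last comment: JavaScript regex literals have no free-spacing mode, so the commented layout from the question cannot be written as a literal. One possible workaround, shown here as a rough sketch rather than an established idiom, is to assemble the pattern from commented string fragments. The variable names `blockquote` and `oneLine` are made up for this example.

```javascript
// Build the pattern from small fragments so each one can carry a comment.
// String.raw keeps backslashes intact, so `\n` reaches the RegExp
// constructor as the two characters backslash + n.
const blockquote = new RegExp(
  [
    String.raw`(`,               // wrap whole match in $1
    String.raw`(`,
    String.raw`^[ \t]*>[ \t]?`,  // '>' at the start of a line
    String.raw`.+\n`,            // rest of the first line
    String.raw`(.+\n)*`,         // subsequent consecutive lines
    String.raw`\n*`,             // blanks
    String.raw`)+`,
    String.raw`)`,
  ].join(""),
  "gm"
);

// The assembled pattern is the same as the one-line literal from the question.
const oneLine = /((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)/gm;
console.log(blockquote.source === oneLine.source); // true
```

The trade-off is that the pattern is now a list of strings, so editors and linters that understand regex literals will no longer check it.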

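Tags: regex maintainability

【Solution 1】:

It is worth the effort to become fluent at reading regular expressions in their one-line form. Most of the time, that is how they are written.

【Comments】:

Yes, as with the syntax of any programming language, it becomes readable after a while.

【Solution 2】:

RegexBuddy will "translate" any regular expression for you. When fed your example regex, it outputs: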

    ((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)

    Options: ^ and $ match at line breaks

    Match the regular expression below and capture its match into backreference number 1 «((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)»
       Match the regular expression below and capture its match into backreference number 2 «(^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+»
          Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
          Note: You repeated the capturing group itself.  The group will capture only the last iteration.
              Put a capturing group around the repeated group to capture all iterations. «+»
          Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
          Match a single character present in the list below «[ \t]*»
             Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
             The character “ ” « »
             A tab character «\t»
          Match the character “>” literally «>»
          Match a single character present in the list below «[ \t]?»
             Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
             The character “ ” « »
             A tab character «\t»
          Match any single character that is not a line break character «.+»
             Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
          Match a line feed character «\n»
          Match the regular expression below and capture its match into backreference number 3 «(.+\n)*»
             Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
             Note: You repeated the capturing group itself.  The group will capture only the last iteration.
                 Put a capturing group around the repeated group to capture all iterations. «*»
             Match any single character that is not a line break character «.+»
                Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
             Match a line feed character «\n»
          Match a line feed character «\n*»
             Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»

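This does look intimidating in text form, but it is much more readable in HTML form (which can't be reproduced here) or in RegexBuddy itself. It also points out common gotchas (for example, repeating the capturing groups is probably not needed here).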
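
The "only the last iteration" note in that output can be checked directly. The snippet below is a small sketch with a made-up sample string. It runs the pattern once with the original groups and once with the inner groups made non-capturing. Whether the inner groups can safely be dropped depends on how the replacement uses them, which the question does not show, so treat the second pattern as an assumption that only `$1` is needed.

```javascript
// Made-up sample: one quoted line followed by two continuation lines.
const text = "> first\nsecond\nthird\n";

// Original grouping: group 3 is (.+\n)*, which is repeated once per
// continuation line and therefore keeps only the last line it matched.
const capturing = /((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)/m;
const m = capturing.exec(text);
console.log(JSON.stringify(m[3])); // "third\n"  ("second\n" is not kept)

// Same pattern with the inner groups written as (?:...), so only the
// whole-match wrapper $1 is captured.
const nonCapturing = /((?:^[ \t]*>[ \t]?.+\n(?:.+\n)*\n*)+)/m;
const m2 = nonCapturing.exec(text);
console.log(m2[1] === m[1]); // true: both capture the same overall text
```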

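【Comments】:

Wow, that is verbose. But it could be helpful for learning regular expressions.
Frankly, I don't find these heavy explanations any easier to understand than the one-line regex.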

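【Solution 3】:

I like Expresso.

【Comments】:

【Solution 4】:

After a while, I got used to reading these things. There isn't much to most regular expressions, and if you want to use them more often, I recommend the site http://www.regular-expressions.info/.

【Comments】:

【Solution 5】: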

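A regular expression is just a way of expressing masks and the like. In the end, it is just a "language" with its own syntax.
Commenting every bit of a regular expression is the same as commenting every line of a project.
Of course it helps people who don't understand your code, but it is useless if you (the developer) understand what the regular expression means.

For me, reading a regular expression is the same as reading code. If the expression is really complex, an explanation below it may be useful, but most of the time it isn't necessary.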
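
One way to read "an explanation below it" is a short summary comment plus a couple of example inputs, instead of a note on every token. The sketch below is only an illustration. The constant name `BLOCKQUOTE_RUN` and the sample strings are invented, and the summary is this example's reading of the pattern from the question.

```javascript
// Matches a run of "> quoted" blocks: a line that starts with optional
// indentation and '>', the non-blank lines that directly follow it,
// and any blank lines after those.
const BLOCKQUOTE_RUN = /((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)/gm;

// Example inputs double as documentation of what the pattern does.
const sample = "intro\n\n> quoted line\ncontinued\n\nafter\n";
console.log(sample.match(BLOCKQUOTE_RUN));
// [ '> quoted line\ncontinued\n\n' ]

console.log("no quote here\n".match(BLOCKQUOTE_RUN));
// null
```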

【Comments】: