【Question title】: How do you understand regular expressions that are written in one line?
【Posted】: 2010-08-29 20:20:18
【Question】:

This is a neat, well-documented regular expression that is easy to understand, maintain, and modify.

    text = text.replace(/
    (                               // Wrap whole match in $1
        (
            ^[ \t]*>[ \t]?          // '>' at the start of a line
            .+\n                    // rest of the first line
            (.+\n)*                 // subsequent consecutive lines
            \n*                     // blanks
        )+
    )
    /gm,

But how do you deal with one of these?

    text = text.replace(/((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)/gm,

Is there some kind of beautifier that can make sense of it and describe what it does?

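【Comments】:

  • Maybe some of the tools at stackoverflow.com/questions/32282/regex-testing-tools will help.
  • I do neither. I always use 6-10 lines of code with explode/join/strstr/substr (PHP) instead. That is easier to understand, maintain, and even write.
  • Not every language or library that supports regular expressions allows the extra whitespace, so not all of them can be laid out as cleanly as your example.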
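To illustrate the last comment: JavaScript regex literals have no free-spacing flag, so the commented layout from the question cannot be written directly as a literal. One possible workaround, shown here only as a sketch, is to assemble the pattern from commented string fragments. The names `blockquotePattern`, `oneLiner`, and `sample` are made up for this illustration and do not come from the original code.

```javascript
// Sketch: build the pattern from commented string fragments.
// Backslashes are doubled because each fragment is a string literal.
const blockquotePattern = new RegExp(
  [
    '(',                      // wrap whole match in $1
      '(',
        '^[ \\t]*>[ \\t]?',   // '>' at the start of a line
        '.+\\n',              // rest of the first line
        '(.+\\n)*',           // subsequent consecutive lines
        '\\n*',               // blanks
      ')+',
    ')',
  ].join(''),
  'gm'
);

// The one-line form from the question, for comparison.
const oneLiner = /((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)/gm;

// Both forms describe the same pattern.
console.log(blockquotePattern.source === oneLiner.source); // true

// Hypothetical input: a quoted block surrounded by ordinary text.
const sample = 'intro\n\n> quoted line\ncontinued\n\nafter\n';
const matches = sample.match(blockquotePattern);
console.log(JSON.stringify(matches));
```

The trade-off is the doubled backslashes, which add noise of their own, so this is a matter of taste rather than a clear improvement.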

Tags: regex maintainability


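【Answer 1】:

It is worth the effort to become proficient at reading regular expressions in their one-line form. Most of the time, that is how they are written.

【Comments】:

  • Yes, as with the syntax of any programming language, it becomes readable after a while.

【Answer 2】:

RegexBuddy will "translate" any regular expression for you. When fed your example regex, it outputs: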

((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)

Options: ^ and $ match at line breaks

Match the regular expression below and capture its match into backreference number 1 «((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)»
   Match the regular expression below and capture its match into backreference number 2 «(^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Note: You repeated the capturing group itself.  The group will capture only the last iteration.  
          Put a capturing group around the repeated group to capture all iterations. «+»
      Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
      Match a single character present in the list below «[ \t]*»
         Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
         The character “ ” « »
         A tab character «\t»
      Match the character “>” literally «>»
      Match a single character present in the list below «[ \t]?»
         Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
         The character “ ” « »
         A tab character «\t»
      Match any single character that is not a line break character «.+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Match a line feed character «\n»
      Match the regular expression below and capture its match into backreference number 3 «(.+\n)*»
         Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
         Note: You repeated the capturing group itself.  The group will capture only the last iteration.  
             Put a capturing group around the repeated group to capture all iterations. «*»
         Match any single character that is not a line break character «.+»
            Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
         Match a line feed character «\n»
      Match a line feed character «\n*»
         Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»

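This does look intimidating in plain-text form, but it is much more readable in HTML form (which can't be reproduced here) or in RegexBuddy itself. It also points out common gotchas (for example, the repeated capturing groups, which are probably not wanted here).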
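The note about repeated capturing groups can be checked directly. The sketch below runs the question's pattern (without the `g` flag, so that `exec` returns the groups) against a made-up input, then compares it with a variant whose inner groups are non-capturing. The `sample` text and the `lean` variant are assumptions for illustration, not part of the original code.

```javascript
// Sketch: a repeated capturing group keeps only its last iteration.
const pattern = /((^[ \t]*>[ \t]?.+\n(.+\n)*\n*)+)/m;

// Hypothetical input: two quoted paragraphs separated by a blank line.
const sample = '> first quote\n\n> second quote\n';
const m = pattern.exec(sample);

console.log(JSON.stringify(m[1])); // the whole run of quoted text
console.log(JSON.stringify(m[2])); // only the last iteration of group 2
console.log(m[3]);                 // group 3 took part in no iteration here

// If only $1 is used, the inner groups could be non-capturing.
const lean = /((?:^[ \t]*>[ \t]?.+\n(?:.+\n)*\n*)+)/m.exec(sample);
console.log(lean[1] === m[1]);
```

Whether the inner groups can really be dropped depends on how the replacement uses them, which the truncated snippet in the question does not show.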

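【Comments】:

  • Wow, that's verbose. But it is probably useful for learning regular expressions.
  • Frankly, I don't find these heavyweight explanations any easier to understand than the one-line regex.

【Answer 3】:

I like Expresso.

【Comments】: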

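【Answer 4】:

After a while, I got used to reading these things. There isn't that much to most regular expressions, and if you want to use them more often I recommend the site http://www.regular-expressions.info/

【Comments】:

【Answer 5】:

A regular expression is just a way of expressing masks and the like. At the end of the day, it is simply a "language" with its own syntax.
Commenting every bit of a regular expression is the same as commenting every line of a project.
Of course it would help people who don't understand your code, but it is useless if you (the developer) understand what the regex means.

For me, reading a regular expression is the same as reading code. If the expression is really complex, an explanation underneath may be useful, but most of the time it isn't necessary.

【Comments】: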
