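【Question title】: Replacing opening and closing parentheses of a certain structure?
【Posted】: 2017-12-07 16:02:03
【Question】:

I am trying to move parentheses that sit just inside a certain tag to the outside of that tag, i.e. when an opening parenthesis immediately follows the opening tag, or a closing parenthesis immediately precedes the closing tag. Examples: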

<italic>(When a parenthetical sentence stands on its own)</italic>
<italic>(When a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its own)</italic>

After replacement, these lines should become:

(<italic>When a parenthetical sentence stands on its own</italic>)
(<italic>When a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its own</italic>)

However, the following three strings should remain unchanged:

<italic>(When) a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its (own)</italic>
<italic>When a parenthetical sentence stands (on) its own</italic>

But the following strings:

<italic>((When) a parenthetical sentence stands on its own</italic>
<italic>((When) a parenthetical sentence stands on its own)</italic>
<italic>(When) a parenthetical sentence stands on its own)</italic>
<italic>When a parenthetical sentence stands on its (own))</italic>
<italic>(When a parenthetical sentence stands on its (own)</italic>

should become, after replacement:

(<italic>(When) a parenthetical sentence stands on its own</italic>
(<italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>When a parenthetical sentence stands on its (own)</italic>)
(<italic>When a parenthetical sentence stands on its (own)</italic>

The <italic>...</italic> tag may contain nested tags, and a single line may contain several <italic>...</italic> strings. In addition, if <italic>...</italic> contains a nested <inline-formula>...</inline-formula> tag, those tags should be ignored.

Can I do this with a regular expression? If not, how else can I do it?

My approach is as follows (I am still not sure it covers every possible case):

Step 1: <italic>( ---> (<italic>
Find <italic>( where the tag is not followed by a balanced pair of parentheses that is not immediately followed by the closing tag. Matches are only allowed within a single line.

Find what: (<(italic)>)(?!(\((?>(?:(?![()\r\n]).)++|(?3))*+\))(?!</$2\b))(\()
Replace with: $4$1

Step 2: )</italic> ---> </italic>)
Find )</italic> where the tag is not preceded by a balanced pair of parentheses. Matches are only allowed within a single line.

(\))(?<!(?<!<(italic)>)(\((?>(?:(?![()\r\n]).)++|(?3))*+\)))(</2\b>)

【Question comments】:

  • What have you tried? Please show us some code.
  • @nilsK Check the updated question.
  • What about (<italic>(When a parenthetical sentence stands on its own)</italic>)? Do we skip the replacement because there are already parentheses outside?
  • @Flater (<italic>(When a parenthetical sentence stands on its own)</italic>) should be replaced with ((<italic>When a parenthetical sentence stands on its own</italic>))
  • Before going down this road, be sure to read stackoverflow.com/questions/1732348/…

Tags: c# regex


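【Solution 1】:

You can do this in several different ways. I would start by defining when a tag is replaceable.

  1. If the text inside the tag starts with ( and that parenthesis is closed immediately before the closing tag, or is never closed, we can replace the opening tag
  2. If the text inside the tag ends with ) and that parenthesis was opened immediately after the opening tag, or was never opened, we can replace the closing tag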
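
The two rules can be sketched directly on the text between the tags. This is only an illustration under stated assumptions: `ParenRules` and `MoveOuterParens` are hypothetical names, and the inner text is assumed to contain no nested tags. Instead of a single depth counter, it records each parenthesis's partner, so that text such as `(a) b (c)` is left alone while `(a b c)` is not.

```csharp
using System;
using System.Collections.Generic;

public static class ParenRules
{
    // Hypothetical helper: `inner` is the text between the opening and
    // closing tag, assumed here to contain no nested tags.
    public static string MoveOuterParens(string openTag, string inner, string closeTag)
    {
        // partner[i] = index of the parenthesis matching inner[i], or -1 if none.
        var partner = new int[inner.Length];
        var open = new Stack<int>();
        for (int i = 0; i < inner.Length; i++)
        {
            partner[i] = -1;
            if (inner[i] == '(')
            {
                open.Push(i);
            }
            else if (inner[i] == ')' && open.Count > 0)
            {
                int j = open.Pop();
                partner[i] = j;
                partner[j] = i;
            }
        }

        int last = inner.Length - 1;
        // Rule 1: the leading '(' is unclosed, or closed by the very last character.
        bool moveOpen = last >= 0 && inner[0] == '('
            && (partner[0] == -1 || partner[0] == last);
        // Rule 2: the trailing ')' is unopened, or opened by the very first character.
        bool moveClose = last >= 0 && inner[last] == ')'
            && (partner[last] == -1 || partner[last] == 0);

        int start = moveOpen ? 1 : 0;
        int end = moveClose ? last : inner.Length;
        return (moveOpen ? "(" : "") + openTag
            + inner.Substring(start, end - start)
            + closeTag + (moveClose ? ")" : "");
    }
}
```

For example, `ParenRules.MoveOuterParens("<italic>", "(When) a sentence)", "</italic>")` returns `<italic>(When) a sentence</italic>)`: the leading parenthesis is closed mid-text and stays, while the unopened trailing one moves out.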

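This problem seems well suited to a parser approach that tracks parenthesis state (whether the tag text starts with a parenthesis, and how deeply parentheses are nested at the current point). Writing a parser lets us build the replacement constructively, instead of searching for and replacing substrings with a regular expression, and it is naturally recursive, so it handles nesting. Doing this with regular expressions seems a bit convoluted. Here is what I came up with.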

using System;
using System.IO;
using System.Text;

namespace ParenParser {
    public class Program
    {
        public static Stream GenerateStreamFromString(string s)
        {
            MemoryStream stream = new MemoryStream();
            StreamWriter writer = new StreamWriter(stream);
            writer.Write(s);
            writer.Flush();
            stream.Position = 0;
            return stream;
        }

        public static String Process(StreamReader s) { // root
            StringBuilder output = new StringBuilder();
            while (!s.EndOfStream) {
                var ch = Convert.ToChar(s.Read());
                if (ch == '<') {
                    output.Append(ProcessTag(s, true));
                } else {
                    output.Append(ch);
                }
            }

            return output.ToString();
        }

        public static String ProcessTag(StreamReader s, bool skipOpeningBracket = true) {
            int currentParenDepth = 0;
            StringBuilder openingTag = new StringBuilder(), allTagText = new StringBuilder(), closingTag = new StringBuilder();
            bool inOpeningTag = false, inClosingTag = false;
            if (skipOpeningBracket) {
                inOpeningTag = true;
                openingTag.Append('<');
                skipOpeningBracket = false;
            }

            while (!s.EndOfStream) {
                var ch = Convert.ToChar(s.Read());
                if (ch == '<') { // start of a tag
                    var nextCh = s.Peek(); // int; -1 at end of stream, which Convert.ToChar would reject
                    if (nextCh == '/') { // closing tag!
                        closingTag.Append(ch);
                        inClosingTag = true;
                    } else if (openingTag.ToString().Length != 0) { // already seen a tag, recurse
                        allTagText.Append(ProcessTag(s, true));
                        continue;
                    } else {
                        openingTag.Append(ch);
                        inOpeningTag = true;
                    }
                }
                else if (inOpeningTag) {
                    openingTag.Append(ch);
                    if (ch == '>') {
                        inOpeningTag = false;
                    }
                }
                else if (inClosingTag) {
                    closingTag.Append(ch);
                    if (ch == '>') {
                        // Done!
                        var allTagTextString = allTagText.ToString();
                        if (allTagTextString.Length > 0 && allTagTextString[0] == '(' && allTagTextString[allTagTextString.Length - 1] == ')' && currentParenDepth == 0) {
                            return "(" + openingTag.ToString() + allTagTextString.Substring(1, allTagTextString.Length - 2) + closingTag.ToString() + ")";
                        } else if (allTagTextString.Length > 0 && allTagTextString[0] == '(' && currentParenDepth > 0) { // unclosed
                            return "(" + openingTag.ToString() + allTagTextString.Substring(1, allTagTextString.Length - 1) + closingTag.ToString();
                        } else if (allTagTextString.Length > 0 && allTagTextString[allTagTextString.Length - 1] == ')' && currentParenDepth < 0) { // unopened
                            return openingTag.ToString() + allTagTextString.Substring(0, allTagTextString.Length - 1) + closingTag.ToString() + ")";
                        } else {
                            return openingTag.ToString() + allTagTextString + closingTag.ToString();
                        }
                    }
                }
                else
                {
                    allTagText.Append(ch);
                    if (ch == '(') {
                        currentParenDepth++;
                    }
                    else if (ch == ')') {
                        currentParenDepth--;
                    }
                }
            }

            return openingTag.ToString() + allTagText.ToString() + closingTag.ToString();
        }

        public static void Main()
        {
            var testCases = new String[] {
                // Should change
                "<italic>(When a parenthetical sentence stands on its own)</italic>",
                "<italic>(When a parenthetical sentence stands on its own</italic>",
                "<italic>When a parenthetical sentence stands on its own)</italic>",

                // Should remain unchanged
                "<italic>(When) a parenthetical sentence stands on its own</italic>",
                "<italic>When a parenthetical sentence stands on its (own)</italic>",
                "<italic>When a parenthetical sentence stands (on) its own</italic>",

                // Should be changed
                "<italic>((When) a parenthetical sentence stands on its own</italic>",
                "<italic>((When) a parenthetical sentence stands on its own)</italic>",
                "<italic>(When) a parenthetical sentence stands on its own)</italic>",
                "<italic>When a parenthetical sentence stands on its (own))</italic>",
                "<italic>(When a parenthetical sentence stands on its (own)</italic>",

                // Other cases
                "<italic>(Try This on!)</italic>",
                "<italic><italic>(Try This on!)</italic></italic>",
                "<italic></italic>",
                "",
                "()",
                "<italic>()</italic>",
                "<italic>"
            };

            foreach(var testCase in testCases) {
                using(var testCaseStreamReader = new StreamReader(GenerateStreamFromString(testCase))) {
                    Console.WriteLine(testCase + " --> " + Process(testCaseStreamReader));
                }
            }
        }
    }
}

The test case output looks like this:

<italic>(When a parenthetical sentence stands on its own</italic> --> (<italic>When a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its own)</italic> --> <italic>When a parenthetical sentence stands on its own</italic>)
<italic>(When) a parenthetical sentence stands on its own</italic> --> <italic>(When) a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its (own)</italic> --> <italic>When a parenthetical sentence stands on its (own)</italic>
<italic>When a parenthetical sentence stands (on) its own</italic> --> <italic>When a parenthetical sentence stands (on) its own</italic>
<italic>((When) a parenthetical sentence stands on its own</italic> --> (<italic>(When) a parenthetical sentence stands on its own</italic>
<italic>((When) a parenthetical sentence stands on its own)</italic> --> (<italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>(When) a parenthetical sentence stands on its own)</italic> --> <italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>When a parenthetical sentence stands on its (own))</italic> --> <italic>When a parenthetical sentence stands on its (own)</italic>)
<italic>(When a parenthetical sentence stands on its (own)</italic> --> (<italic>When a parenthetical sentence stands on its (own)</italic>
<italic>(Try This on!)</italic> --> (<italic>Try This on!</italic>)
<italic><italic>(Try This on!)</italic></italic> --> (<italic><italic>Try This on!</italic></italic>)
<italic></italic> --> <italic></italic>
 --> 
() --> ()
<italic>()</italic> --> (<italic></italic>)
<italic> --> <italic>

【Comments】:

  • I get an "Identifier expected" error on the line while (!s.)
  • @Don_B It should be while (!s.EndOfStream); not sure how it got cut off. Updated.
  • How do I do this for the files in a path, i.e. how do I combine the following code: string path=@"D:\Test"; string[] files=Directory.GetFiles(path,"*.xml"); foreach (var file in files) { string testCases=File.ReadAllText(file); } foreach(var testCase in testCases) { using(var testCaseStreamReader = new StreamReader(GenerateStreamFromString(testCase))) { Process(testCaseStreamReader); } }
  • You can construct a StreamReader around a FileStream; File.Open(... returns a FileStream for the given path.
  • Would you mind updating your code to do that? I'm not very good with Streams....
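
The suggestion in the comments could look roughly like the following. It is a minimal sketch, not the answer author's code: `FileBatch` and `ProcessFiles` are hypothetical names, and the per-file processing step is passed in as a delegate so that the answer's `Process(StreamReader)` method can be supplied by the caller. Only `Directory.GetFiles`, `File.OpenRead`, and `StreamReader` come from the .NET base library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FileBatch
{
    // Hypothetical helper: runs `process` over every file in `directory`
    // whose name matches `pattern`, and returns path -> processed text.
    public static Dictionary<string, string> ProcessFiles(
        string directory, string pattern, Func<StreamReader, string> process)
    {
        var results = new Dictionary<string, string>();
        foreach (var file in Directory.GetFiles(directory, pattern))
        {
            // File.OpenRead returns a FileStream; StreamReader wraps it directly,
            // so no intermediate string or MemoryStream is needed.
            using (var reader = new StreamReader(File.OpenRead(file)))
            {
                results[file] = process(reader);
            }
        }
        return results;
    }
}
```

Assuming the answer's class is in scope, a call might be `FileBatch.ProcessFiles(@"D:\Test", "*.xml", ParenParser.Program.Process)`. Each reader is disposed before the method returns, so the caller can then write a result back with `File.WriteAllText(path, text)` if the files should be updated in place.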