【Question Title】: Regex to match <Key>....<Value> pattern
【Posted】: 2010-07-06 07:36:02
【Question Description】:

I have the following data, sent by an external system, that needs to be parsed for a specific key:

<ContextDetails>
<Context><Key>ID</Key><Value>100</Value></Context>
<Context><Key>Name</Key><Value>MyName</Value></Context>
</ContextDetails>

I tried parsing it with a regular expression to get the value for the key Name:

<Context><Key>Name</Key><Value>.</Value></Context>

But the result is blank.

What do I need to change to fix this regex?

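【Question Comments】:

  • You shouldn't use a regex for this.
  • That doesn't look like a regex to me. Which language are you using for the regex? Java? .NET? JavaScript? Perl? Ruby? Something else?
  • Looks like a perfect job for an XML parser.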

Tags: .net xml regex parsing


【Solution 1】:

If this is XML, load it into an XDocument and query it.

See @Jens's answer for details on how to do this.

【Comments】:

    【Solution 2】:

    To expand on Oded's answer, this is how you should do it:

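    using System.Linq;
    using System.Xml.Linq;

    // Parse the XML sent by the external system.
    XDocument doc = XDocument.Parse(@"<ContextDetails>
    <Context><Key>ID</Key><Value>100</Value></Context>
    <Context><Key>Name</Key><Value>MyName</Value></Context>
    </ContextDetails>");

    // Select the single <Context> whose <Key> is "Name" and read its <Value>.
    String name = doc.Root.Elements("Context")
                          .Single(xe => xe.Element("Key").Value == "Name")
                          .Element("Value").Value;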
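
Note that Single() throws if the key is missing or occurs more than once. If the external system might omit a key, a more tolerant lookup could look like the sketch below. The class and method names (ContextLookup, FindValue) are hypothetical and not part of the original answer; the sketch assumes the same document shape as above and returns the first match, or null when there is none.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ContextLookup
{
    // Hypothetical helper: returns the <Value> text for the given key,
    // or null when no <Context> element carries that key.
    public static string FindValue(string xml, string key)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root
                  .Elements("Context")
                  // The explicit (string) cast yields null for a missing
                  // element instead of throwing NullReferenceException.
                  .Where(c => (string)c.Element("Key") == key)
                  .Select(c => (string)c.Element("Value"))
                  .FirstOrDefault();
    }

    public static void Main()
    {
        string xml = @"<ContextDetails>
<Context><Key>ID</Key><Value>100</Value></Context>
<Context><Key>Name</Key><Value>MyName</Value></Context>
</ContextDetails>";

        Console.WriteLine(FindValue(xml, "Name"));            // MyName
        Console.WriteLine(FindValue(xml, "Missing") == null); // True
    }
}
```

Whether a missing key should be an error or a null is a design choice; Single() is the stricter option and is reasonable when the key is guaranteed to be present exactly once.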
    

    【Comments】:

      【Solution 3】:

      It seems to me you are going about this the wrong way. You should use an XML parser. http://www.tutorialspoint.com/ruby/ruby_xml_xslt.htm is only a guide, but it may help.

      【Comments】:

        【Solution 4】:

        I think the regular expression that matches all key-value pairs is:

        <Context>\s*?<Key>(.*?)\</Key>\s*?<Value>(.*?)</Value>\s*?</Context>
        

        Explanation:

        // <Context>\s*?<Key>(.*?)\</Key>\s*?<Value>(.*?)</Value>\s*?</Context>
        // 
        // Match the characters "<Context>" literally «<Context>»
        // Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
        //    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
        // Match the characters "<Key>" literally «<Key>»
        // Match the regular expression below and capture its match into backreference number 1 «(.*?)»
        //    Match any single character that is not a line break character «.*?»
        //       Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
        // Match the character "<" literally «\<»
        // Match the characters "/Key>" literally «/Key>»
        // Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
        //    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
        // Match the characters "<Value>" literally «<Value>»
        // Match the regular expression below and capture its match into backreference number 2 «(.*?)»
        //    Match any single character that is not a line break character «.*?»
        //       Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
        // Match the characters "</Value>" literally «</Value>»
        // Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
        //    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
        // Match the characters "</Context>" literally «</Context>»
        

        Usage:

        using System; // needed for Console
        using System.Text.RegularExpressions;
        public static void RunSnippet()
            {
                Regex RegexObj = new Regex("<Context>\\s*?<Key>(.*?)\\</Key>\\s*?<Value>(.*?)</Value>\\s*?</Context>",
                    RegexOptions.IgnoreCase | RegexOptions.Multiline);
                Match MatchResults = RegexObj.Match(@"<ContextDetails>
                    <Context><Key>ID</Key><Value>100</Value></Context>
                    <Context><Key>Name</Key>   <Value>MyName</Value></Context>
                    </ContextDetails>
                    ");
                while (MatchResults.Success){
                    Console.WriteLine("Key: " + MatchResults.Groups[1].Value)   ;
                    Console.WriteLine("Value: " + MatchResults.Groups[2].Value) ;
                    Console.WriteLine("----");
                    MatchResults = MatchResults.NextMatch();
                }
            }
            /*
            Output:
        
                Key: ID
                Value: 100
                ----
                Key: Name
                Value: MyName
                ----
            */
        

        Regex that captures only the value for the key "Name":

        <Context>\s*?<Key>Name</Key>\s*?<Value>(.*?)</Value>\s*?</Context>
        

        Explanation:

        // <Context>\s*?<Key>Name</Key>\s*?<Value>(.*?)</Value>\s*?</Context>
        // 
        // Match the characters "<Context>" literally «<Context>»
        // Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
        //    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
        // Match the characters "<Key>Name</Key>" literally «<Key>Name</Key>»
        // Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
        //    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
        // Match the characters "<Value>" literally «<Value>»
        // Match the regular expression below and capture its match into backreference number 1 «(.*?)»
        //    Match any single character that is not a line break character «.*?»
        //       Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
        // Match the characters "</Value>" literally «</Value>»
        // Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
        //    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
        // Match the characters "</Context>" literally «</Context>»
        

        Usage:

        string SubjectString = @"<ContextDetails>
                    <Context><Key>ID</Key><Value>100</Value></Context>
                    <Context><Key>Name</Key>   <Value>MyName</Value></Context>
                    </ContextDetails>
                    ";
            Console.WriteLine( Regex.Match(SubjectString, "<Context>\\s*?<Key>Name</Key>\\s*?<Value>(.*?)</Value>\\s*?</Context>",
                    RegexOptions.IgnoreCase | RegexOptions.Multiline).Groups[1].Value );
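
If the key comes from a variable rather than being hard-coded, it should probably be escaped before it is spliced into the pattern. The following is only a sketch: ContextRegex and FindValue are hypothetical names, and unlike the snippet above it matches case-sensitively and omits RegexOptions.Multiline, which only changes how ^ and $ behave and has no effect on this pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ContextRegex
{
    // Hypothetical helper: builds the pattern for an arbitrary key.
    // Regex.Escape keeps metacharacters in the key (".", "+", ...) literal.
    public static string FindValue(string input, string key)
    {
        string pattern = @"<Context>\s*<Key>" + Regex.Escape(key) +
                         @"</Key>\s*<Value>(.*?)</Value>\s*</Context>";
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string input = @"<ContextDetails>
<Context><Key>ID</Key><Value>100</Value></Context>
<Context><Key>Name</Key>   <Value>MyName</Value></Context>
</ContextDetails>";

        Console.WriteLine(FindValue(input, "Name"));         // MyName
        // Without Regex.Escape, "N.me" would match the key "Name".
        Console.WriteLine(FindValue(input, "N.me") == null); // True
    }
}
```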
        

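        【Comments】:

        • Wow, that is quite an explanation! =) May I ask whether you used a generator to produce it? That would come in handy!
        • RegExBuddy generated the explanation. It is a regex editor with a debugger. (URL: regexbuddy.com)

        【Solution 5】: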

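        Can you use an XML parser? If so, use it; it is the right tool for this job.

        If all you have is a text editor and you are willing to check each match by hand, then you can use a regex. The error in your regex is that . matches exactly one character (any character except a line break). So you need to replace it with .*? (matches any number of characters, but as few as possible) or, better still, [^<]*

        The latter means "zero or more characters other than <" (which is the delimiter). Of course, this only works if the value you are looking for contains no <.

        Your regex also assumes that the whole match is on one line with no whitespace between the tags, so it fails in all other cases.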
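
        To make the difference concrete, here is a minimal C# sketch (the DotDemo class name is hypothetical) that runs the original pattern and the [^<]* variant against a single-line input. It adds a capture group around [^<]* so the value can be read back; that group is an addition for illustration, not part of the pattern in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotDemo
{
    public static void Main()
    {
        string input = "<Context><Key>Name</Key><Value>MyName</Value></Context>";

        // Original pattern: "." stands for exactly one character,
        // so a six-character value such as "MyName" cannot match.
        string original = "<Context><Key>Name</Key><Value>.</Value></Context>";

        // Corrected pattern: capture zero or more characters other than "<".
        string corrected = "<Context><Key>Name</Key><Value>([^<]*)</Value></Context>";

        Console.WriteLine(Regex.IsMatch(input, original));                // False
        Console.WriteLine(Regex.Match(input, corrected).Groups[1].Value); // MyName
    }
}
```

        To tolerate whitespace between the tags as well, \s* could be placed between adjacent tags, as the longer patterns in Solution 4 do.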

        Update: I just saw your edit, so you do have access to an XML parser. Use Oded's answer.

        【Comments】:
