【Question title】: Reading XML string into a list
【Posted】: 2016-02-28 06:01:12
【Question】:

I have an XML string:

<message code="L1" />
<message code="D1" />
<message code="A1">NAME: JON              ID: 99017   CODE: 111222333    TYPE: ST</message>
<message code="A2">NTC:           RISK:               START: 09/01/2015     STATUS: ACTIVE</message>
<message code="CD">STATE: MS     LAST CANCEL REASON:</message>
<message code="A4">A, TIM                   (PRIMARY)      OS      09/01/2015    09/01/2016</message>
<message code="D1" />
<message code="A1">NAME: Tim              ID: 99017   CODE: 111222333    TYPE: ST</message>
<message code="A2">NTC:           RISK:               START: 09/01/2015     STATUS: EXPIRED</message>
<message code="CD">STATE: MS     LAST CANCEL REASON:</message>
<message code="A4">A, TIM                   (PRIMARY)      OS      09/01/2014    09/01/2015</message>               
<message code="D1" />

I want to read this XML string into a list. If you look at the XML, it contains two sections:

<message code="A1">NAME: JON              ID: 99017   CODE: 111222333    TYPE: ST</message>
<message code="A2">NTC:           RISK:               START: 09/01/2015     STATUS: ACTIVE</message>
<message code="CD">STATE: MS     LAST CANCEL REASON:</message>
<message code="A4">A, TIM                   (PRIMARY)      OS      09/01/2015    09/01/2016</message>
<message code="D1" />

<message code="A1">NAME: Tim              ID: 99017   CODE: 111222333    TYPE: ST</message>
<message code="A2">NTC:           RISK:               START: 09/01/2015     STATUS: EXPIRED</message>
<message code="CD">STATE: MS     LAST CANCEL REASON:</message>
<message code="A4">A, TIM                   (PRIMARY)      OS      09/01/2014    09/01/2015</message>               
<message code="D1" />

I want to put the elements into a list:

var subjects = new List<subject>();
subjects.Add(new subject()
{
  Name = "JON",
  State = "MS"
});

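I tried a foreach over the XmlNodes and then used Substring to get the values.

【Comments】:

  • Please show exactly what you tried and what went wrong, in the form of a minimal reproducible example.
  • "I tried a foreach over the XmlNodes and then used Substring to get the values." Well, that is a sound approach, good for you. What is the problem with doing it that way?
  • This is not valid XML, because you appear to have multiple root elements. You need to either treat each line as a separate XML document, or use XmlReader with XmlReaderSettings.ConformanceLevel set to ConformanceLevel.Fragment, and then post-process accordingly.
  • I posted only part of the XML string, which is why it is not valid XML. I am using Substring to get the values, so hard-coded start and end indexes are needed. Is there any way to avoid that?
  • @user1893874 Why would there be hard-coded start and end indexes?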
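
Besides the two options named in the comments, a rootless fragment can also be parsed by wrapping it in a synthetic root element. The sketch below is only an illustration of that idea, not something proposed in the thread: `FragmentParser`, `ParseMessages`, and the `<root>` wrapper name are hypothetical, and it assumes the fragment carries no XML declaration of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class FragmentParser
{
    // Hypothetical helper: wraps the rootless fragment in a synthetic <root>
    // element so XElement.Parse accepts it, then returns the <message>
    // elements in document order.
    public static List<XElement> ParseMessages(string fragment)
    {
        XElement root = XElement.Parse("<root>" + fragment + "</root>");
        return root.Elements("message").ToList();
    }
}

static class FragmentDemo
{
    static void Main()
    {
        string fragment =
            "<message code=\"D1\" />\n" +
            "<message code=\"CD\">STATE: MS     LAST CANCEL REASON:</message>\n";

        foreach (XElement message in FragmentParser.ParseMessages(fragment))
        {
            // Value is the empty string for self-closing elements such as D1.
            Console.WriteLine((string)message.Attribute("code") + " | " + message.Value);
        }
    }
}
```

Wrapping loads the whole string into memory at once; for large inputs the streaming XmlReader option from the comment above is likely the better fit.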

Tags: c# xml


【Solution 1】:

Try regular expressions. I use D1 to start each subject, but you may need to ignore D1 and use A1 instead.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Dictionary<string, string>> subjects = new List<Dictionary<string, string>>();
            string xml =
                "<message code=\"L1\" />\n" +
                "<message code=\"D1\" />\n" +
                "<message code=\"A1\">NAME: JON              ID: 99017   CODE: 111222333    TYPE: ST</message>\n" +
                "<message code=\"A2\">NTC:           RISK:               START: 09/01/2015     STATUS: ACTIVE</message>\n" +
                "<message code=\"CD\">STATE: MS     LAST CANCEL REASON:</message>\n" +
                "<message code=\"A4\">A, TIM                   (PRIMARY)      OS      09/01/2015    09/01/2016</message>\n" +
                "<message code=\"D1\" />\n" +
                "<message code=\"A1\">NAME: Tim              ID: 99017   CODE: 111222333    TYPE: ST</message>\n" +
                "<message code=\"A2\">NTC:           RISK:               START: 09/01/2015     STATUS: EXPIRED</message>\n" +
                "<message code=\"CD\">STATE: MS     LAST CANCEL REASON:</message>\n" +
                "<message code=\"A4\">A, TIM                   (PRIMARY)      OS      09/01/2014    09/01/2015</message>\n" +
                "<message code=\"D1\" />\n";

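            // pattern1 captures the code attribute and, when present, the element's inner text.
            string pattern1 = "<message code=\"(?'code'[^\"]*)\"(>(?'innertext'[^<]*))?";
            // pattern2 splits the inner text into "NAME: value" pairs.
            string pattern2 = @"((?'name'[^:]*):\s?(?'value'[\w0-9/\<\>]*)?)";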
            StringReader reader = new StringReader(xml);
            string inputLine = "";

            Dictionary<string, string> subject = null;
            while((inputLine = reader.ReadLine()) != null)
            {
                Match match1 = Regex.Match(inputLine, pattern1);
                string code = match1.Groups["code"].Value;
                string innertext = match1.Groups["innertext"].Value;

                if (code == "D1")
                {
                    subject = new Dictionary<string, string>();
                    subjects.Add(subject);
                }
                else
                {
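                    // Skip text that appears before the first D1 separator (subject is still null).
                    if (subject != null && innertext.Length > 0)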
                    {
                        MatchCollection matches = Regex.Matches(innertext, pattern2);
                        foreach (Match match2 in matches)
                        {
                            string name = match2.Groups["name"].Value.Trim();
                            string value = match2.Groups["value"].Value.Trim();
                            // The indexer overwrites a repeated key, whereas Add would throw.
                            subject[name] = value;
                        }
                    }
                }
            }

        }
    }
}
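
To see what the key/value pattern extracts without any hard-coded Substring indexes, here is a minimal sketch that applies the answer's pattern2 to a single line of inner text. `FieldParser` and `ParseFields` are hypothetical names introduced for illustration; the pattern itself is copied unchanged from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class FieldParser
{
    // Same key/value pattern as pattern2 in the answer.
    const string Pattern = @"((?'name'[^:]*):\s?(?'value'[\w0-9/\<\>]*)?)";

    // Hypothetical helper: returns the "NAME: value" pairs found in one message body.
    public static Dictionary<string, string> ParseFields(string innerText)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in Regex.Matches(innerText, Pattern))
        {
            // Everything up to the colon is the key; the run of word
            // characters and slashes after it is the value (possibly empty).
            fields[m.Groups["name"].Value.Trim()] = m.Groups["value"].Value.Trim();
        }
        return fields;
    }
}

static class FieldDemo
{
    static void Main()
    {
        var fields = FieldParser.ParseFields(
            "NAME: JON              ID: 99017   CODE: 111222333    TYPE: ST");
        Console.WriteLine(fields["NAME"]); // JON
        Console.WriteLine(fields["ID"]);   // 99017
    }
}
```

One limitation to keep in mind: the value group stops at the first space, so a value such as "JON SMITH" would be cut to "JON" and the remainder would be folded into the next key.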

Here is a second approach that uses both XML and Regex:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Dictionary<string, string>> subjects = new List<Dictionary<string, string>>();
            string xml =
                "<message code=\"L1\" />\n" +
                "<message code=\"D1\" />\n" +
                "<message code=\"A1\">NAME: JON              ID: 99017   CODE: 111222333    TYPE: ST</message>\n" +
                "<message code=\"A2\">NTC:           RISK:               START: 09/01/2015     STATUS: ACTIVE</message>\n" +
                "<message code=\"CD\">STATE: MS     LAST CANCEL REASON:</message>\n" +
                "<message code=\"A4\">A, TIM                   (PRIMARY)      OS      09/01/2015    09/01/2016</message>\n" +
                "<message code=\"D1\" />\n" +
                "<message code=\"A1\">NAME: Tim              ID: 99017   CODE: 111222333    TYPE: ST</message>\n" +
                "<message code=\"A2\">NTC:           RISK:               START: 09/01/2015     STATUS: EXPIRED</message>\n" +
                "<message code=\"CD\">STATE: MS     LAST CANCEL REASON:</message>\n" +
                "<message code=\"A4\">A, TIM                   (PRIMARY)      OS      09/01/2014    09/01/2015</message>\n" +
                "<message code=\"D1\" />\n";

            XmlReaderSettings settings = new XmlReaderSettings();
            settings.ConformanceLevel = ConformanceLevel.Fragment;
            StringReader reader = new StringReader(xml);
            XmlReader xReader = XmlReader.Create(reader, settings);

            // Splits the inner text of each message into "NAME: value" pairs.
            string pattern = @"((?'name'[^:]*):\s?(?'value'[\w0-9/\<\>]*)?)";

            Dictionary<string, string> subject = null;
            while (!xReader.EOF)
            {
                if (xReader.Name != "message")
                {
                    xReader.ReadToFollowing("message");
                }
                if (!xReader.EOF)
                {
                    XElement message = (XElement)XElement.ReadFrom(xReader);
                    string code = (string)message.Attribute("code");
                    if (code == "D1")
                    {
                        subject = new Dictionary<string, string>();
                        subjects.Add(subject);
                    }
                    else
                    {
                        string innertext = (string)message;
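                        // Skip text that appears before the first D1 separator (subject is still null).
                        if (subject != null && innertext.Length > 0)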
                        {
                            MatchCollection matches = Regex.Matches(innertext, pattern);
                            foreach (Match match2 in matches)
                            {
                                string name = match2.Groups["name"].Value.Trim();
                                string value = match2.Groups["value"].Value.Trim();
                                // The indexer overwrites a repeated key, whereas Add would throw.
                                subject[name] = value;
                            }
                        }
                    }

                }
            }
        }
    }
}
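
Both approaches produce a list of dictionaries, while the question asks for a list of `subject` objects. The following sketch shows one possible way to bridge that gap. The `Subject` class (with only Name and State, as in the question) and the `SubjectMapper` helper are assumptions made for illustration. It also skips empty dictionaries, since the final D1 in the sample input starts a new subject that never receives any fields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical target type, modelled on the `subject` class in the question.
class Subject
{
    public string Name { get; set; } = "";
    public string State { get; set; } = "";
}

static class SubjectMapper
{
    // Converts parsed dictionaries into Subject objects, skipping empty
    // dictionaries such as the one created by a trailing D1 separator.
    public static List<Subject> ToSubjects(IEnumerable<Dictionary<string, string>> parsed)
    {
        return parsed
            .Where(d => d.Count > 0)
            .Select(d => new Subject
            {
                Name = GetOrEmpty(d, "NAME"),
                State = GetOrEmpty(d, "STATE")
            })
            .ToList();
    }

    // Returns the value for key, or an empty string when the key is missing.
    static string GetOrEmpty(Dictionary<string, string> d, string key)
    {
        string value;
        return d.TryGetValue(key, out value) ? value : "";
    }
}

static class MapperDemo
{
    static void Main()
    {
        var parsed = new List<Dictionary<string, string>>
        {
            new Dictionary<string, string> { { "NAME", "JON" }, { "STATE", "MS" } },
            new Dictionary<string, string> { { "NAME", "Tim" }, { "STATE", "MS" } },
            new Dictionary<string, string>() // left over from the final D1
        };

        foreach (Subject s in SubjectMapper.ToSubjects(parsed))
        {
            Console.WriteLine(s.Name + " / " + s.State);
        }
    }
}
```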

【Discussion】:
