【发布时间】:2010-12-15 14:43:44
【问题描述】:
我需要能够在 2 个标签之间提取字符串,例如:“00002”来自“morenonxmldata<tag1>0002</tag1>morenonxmldata”
我正在使用 C# 和 .NET 3.5。
【问题讨论】:
我需要能够在 2 个标签之间提取字符串,例如:“00002”来自“morenonxmldata<tag1>0002</tag1>morenonxmldata”
我正在使用 C# 和 .NET 3.5。
【问题讨论】:
Regex regex = new Regex("<tag1>(.*)</tag1>");
var v = regex.Match("morenonxmldata<tag1>0002</tag1>morenonxmldata");
string s = v.Groups[1].ToString();
或者(如 cmets 中所述)匹配最小子集:
Regex regex = new Regex("<tag1>(.*?)</tag1>");
Regex 类位于 System.Text.RegularExpressions 命名空间中。
【讨论】:
(.*) 更改为(.*?) 来使用非贪婪匹配 - 这将防止@Kugel 提到的不正确匹配。
不需要正则表达式的解决方案:
string ExtractString(string s, string tag) {
// You should check for errors in real-world code, omitted for brevity
var startTag = "<" + tag + ">";
int startIndex = s.IndexOf(startTag) + startTag.Length;
int endIndex = s.IndexOf("</" + tag + ">", startIndex);
return s.Substring(startIndex, endIndex - startIndex);
}
【讨论】:
使用惰性匹配和反向引用的Regex 方法:
foreach (Match match in Regex.Matches(
"morenonxmldata<tag1>0002</tag1>morenonxmldata<tag2>abc</tag2>asd",
@"<([^>]+)>(.*?)</\1>"))
{
Console.WriteLine("{0}={1}",
match.Groups[1].Value,
match.Groups[2].Value);
}
【讨论】:
在两个已知值之间提取内容对于以后也很有用。那么为什么不为它创建一个扩展方法。这就是我所做的,简短而简单...
public static string GetBetween(this string content, string startString, string endString)
{
int Start=0, End=0;
if (content.Contains(startString) && content.Contains(endString))
{
Start = content.IndexOf(startString, 0) + startString.Length;
End = content.IndexOf(endString, Start);
return content.Substring(Start, End - Start);
}
else
return string.Empty;
}
【讨论】:
string input = "Exemple of value between two string FirstString text I want to keep SecondString end of my string";
var match = Regex.Match(input, @"FirstString (.+?) SecondString ").Groups[1].Value;
【讨论】:
为了将来参考,我在http://www.mycsharpcorner.com/Post.aspx?postID=15 找到了这段代码 sn-p 如果您需要搜索不同的“标签”,它非常有效。
public static string[] GetStringInBetween(string strBegin,
string strEnd, string strSource,
bool includeBegin, bool includeEnd)
{
string[] result ={ "", "" };
int iIndexOfBegin = strSource.IndexOf(strBegin);
if (iIndexOfBegin != -1)
{
// include the Begin string if desired
if (includeBegin)
iIndexOfBegin -= strBegin.Length;
strSource = strSource.Substring(iIndexOfBegin
+ strBegin.Length);
int iEnd = strSource.IndexOf(strEnd);
if (iEnd != -1)
{
// include the End string if desired
if (includeEnd)
iEnd += strEnd.Length;
result[0] = strSource.Substring(0, iEnd);
// advance beyond this segment
if (iEnd + strEnd.Length < strSource.Length)
result[1] = strSource.Substring(iEnd
+ strEnd.Length);
}
}
else
// stay where we are
result[1] = strSource;
return result;
}
【讨论】:
不使用正则表达式获取单个/多个值
// For Single
var value = inputString.Split("<tag1>", '</tag1>')[1];
// For Multiple
var values = inputString.Split("<tag1>", '</tag1>').Where((_, index) => index % 2 != 0);
【讨论】:
我在数据前后剥离。
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Text.RegularExpressions;
namespace testApp
{
class Program
{
static void Main(string[] args)
{
string tempString = "morenonxmldata<tag1>0002</tag1>morenonxmldata";
tempString = Regex.Replace(tempString, "[\\s\\S]*<tag1>", "");//removes all leading data
tempString = Regex.Replace(tempString, "</tag1>[\\s\\S]*", "");//removes all trailing data
Console.WriteLine(tempString);
Console.ReadLine();
}
}
}
【讨论】:
没有 RegEx,有一些必备的值检查
public static string ExtractString(string soapMessage, string tag)
{
if (string.IsNullOrEmpty(soapMessage))
return soapMessage;
var startTag = "<" + tag + ">";
int startIndex = soapMessage.IndexOf(startTag);
startIndex = startIndex == -1 ? 0 : startIndex + startTag.Length;
int endIndex = soapMessage.IndexOf("</" + tag + ">", startIndex);
endIndex = endIndex > soapMessage.Length || endIndex == -1 ? soapMessage.Length : endIndex;
return soapMessage.Substring(startIndex, endIndex - startIndex);
}
【讨论】:
public string between2finer(string line, string delimiterFirst, string delimiterLast)
{
string[] splitterFirst = new string[] { delimiterFirst };
string[] splitterLast = new string[] { delimiterLast };
string[] splitRes;
string buildBuffer;
splitRes = line.Split(splitterFirst, 100000, System.StringSplitOptions.RemoveEmptyEntries);
buildBuffer = splitRes[1];
splitRes = buildBuffer.Split(splitterLast, 100000, System.StringSplitOptions.RemoveEmptyEntries);
return splitRes[0];
}
private void button1_Click(object sender, EventArgs e)
{
string manyLines = "Received: from exim by isp2.ihc.ru with local (Exim 4.77) \nX-Failed-Recipients: rmnokixm@gmail.com\nFrom: Mail Delivery System <Mailer-Daemon@isp2.ihc.ru>";
MessageBox.Show(between2finer(manyLines, "X-Failed-Recipients: ", "\n"));
}
【讨论】: