C# Regex: change the replacement string for each match (answers)

【Question title】: C# Regex change replacement string for each match
【Posted】: 2017-04-17 16:48:15
【Question】:

I have a string like this:

string s = "<p>Hello world, hello world</p>";
string[] terms = new string[] {"hello", "world"};

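I want to run a replacement on this string so that every occurrence of each term is matched (case-insensitively) and wrapped in a span tag with a running numeric index, like this: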

<p>
    <span id="m_1">Hello</span> 
    <span id="m_2">world</span>, 
    <span id="m_3">hello</span> 
    <span id="m_4">world</span>
</p>

I tried this:

int match = 1;
Regex.Replace(s,
    String.Join("|", String.Join("|", terms.OrderByDescending(s => s.Length)
        .Select(Regex.Escape))),
    String.Format("<span id=\"m_{0}\">$&</span>", match++),
    RegexOptions.IgnoreCase);

The output is:

<p>
    <span id="m_1">Hello</span> 
    <span id="m_1">world</span>, 
    <span id="m_1">hello</span> 
    <span id="m_1">world</span>
</p>

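All the ids are the same (m_1) because match++ is not evaluated for each match; it is evaluated only once for the whole Regex.Replace call, when the replacement string is built. How can I fix this?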

【Comments】:

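It may be easier to parse the HTML and iterate over the span nodes, see: stackoverflow.com/questions/6063203/parsing-html-with-c-net
Does it have to be a regex? A loop with comparisons looks like a simpler and more readable approach.
@ferflores I am parsing it, but the input has no span nodes. Those are the desired output and the actual output. The input is the string above.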

Tags: c# regex

【Solution 1】:

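All you need to do is change the replacement argument from a string pattern to a match evaluator (m => String.Format("<span id=\"m_{0}\">{1}</span>", match++, m.Value)). The evaluator is a delegate that Regex.Replace invokes once per match, so match++ runs for every match: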

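using System;
using System.Linq;
using System.Text.RegularExpressions;

string s1 = "<p>Hello world, hello world</p>";
string[] terms = new string[] {"hello", "world"};
var match = 1; // running index, incremented once per match
// Longest terms first, each escaped, joined into a single alternation pattern.
var pattern = String.Join("|", terms.OrderByDescending(t => t.Length).Select(Regex.Escape));
s1 = Regex.Replace(s1,
    pattern,
    // The evaluator is called for every match, so match++ advances each time.
    m => String.Format("<span id=\"m_{0}\">{1}</span>", match++, m.Value),
    RegexOptions.IgnoreCase);
Console.Write(s1);
// => <p><span id="m_1">Hello</span> <span id="m_2">world</span>, <span id="m_3">hello</span> <span id="m_4">world</span></p>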

See the C# demo.
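To make the difference concrete, here is a minimal side-by-side sketch of the two overloads of Regex.Replace, using the question's input. The class name SpanNumbering and the helpers BuildPattern, WithReplacementString and WithEvaluator are hypothetical names introduced only for this illustration; they are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SpanNumbering
{
    // Hypothetical helper: escapes each term and joins them into one
    // alternation pattern, longest terms first.
    public static string BuildPattern(string[] terms) =>
        string.Join("|", terms.OrderByDescending(t => t.Length).Select(Regex.Escape));

    // String overload: the replacement argument is formatted once, before
    // Regex.Replace runs, so every match receives the same id.
    public static string WithReplacementString(string input, string[] terms)
    {
        int match = 1;
        string replacement = string.Format("<span id=\"m_{0}\">$&</span>", match++);
        return Regex.Replace(input, BuildPattern(terms), replacement, RegexOptions.IgnoreCase);
    }

    // MatchEvaluator overload: the lambda is invoked once per match, so the
    // captured counter advances for each replacement.
    public static string WithEvaluator(string input, string[] terms)
    {
        int match = 1;
        return Regex.Replace(input, BuildPattern(terms),
            m => string.Format("<span id=\"m_{0}\">{1}</span>", match++, m.Value),
            RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string input = "<p>Hello world, hello world</p>";
        string[] terms = { "hello", "world" };

        // Every span gets id m_1.
        Console.WriteLine(WithReplacementString(input, terms));
        // Spans get ids m_1 through m_4.
        Console.WriteLine(WithEvaluator(input, terms));
    }
}
```

The counter lives in a local variable captured by the lambda, so each call to WithEvaluator starts numbering again from 1.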

【Comments】: