【Posted】: 2012-09-12 16:32:20
【Question】:
I have an HTML expression like this:
"This is <h4>Some</h4> Text" + Environment.NewLine +
"This is some more <h5>text</h5>"
I only want to extract the text, so the result should be
"This is Some Text" + Environment.NewLine +
"This is some more text"
How can I do this?
【Question discussion】:
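// Requires the HtmlAgilityPack library.
string html = @"This is <h4>Some</h4> Text" + Environment.NewLine +
"This is some more <h5>text</h5>";
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
// InnerText returns the text of the parsed document with the tags removed.
var str = doc.DocumentNode.InnerText;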
【Discussion】:
Simply use a regular expression: Regex.Replace(source, "<.*?>", string.Empty);
【Discussion】:
<h4 title='e>Sh<opping'>it happens</h4>
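To make the point of that counterexample concrete, here is a minimal sketch that runs the regex from the answer above on it. The class and method names (`NaiveTagStripper`, `StripTags`) are made up for illustration and are not part of the thread. Because the attribute value itself contains `>` and `<`, the lazy pattern ends its first match inside the quoted attribute and leaves part of the attribute text in the output.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original thread) wrapping the regex answer.
public static class NaiveTagStripper
{
    public static string StripTags(string source)
    {
        // Same pattern as in the answer: lazily match from '<' to the next '>'.
        return Regex.Replace(source, "<.*?>", string.Empty);
    }
}

public static class Program
{
    public static void Main()
    {
        string source = "<h4 title='e>Sh<opping'>it happens</h4>";

        // Matches removed in order: "<h4 title='e>", "<opping'>", "</h4>".
        // The fragment "Sh" from inside the attribute value survives.
        Console.WriteLine(NaiveTagStripper.StripTags(source)); // prints "Shit happens"
    }
}
```

The element's actual text is "it happens", so the regex result is wrong for this input. For the simple markup in the question the regex works, but input with `<` or `>` inside attribute values is one case where a real parser is the safer choice.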