【Question title】: Injecting HTML at specific location using HTMLAgilityPack
【Posted】: 2018-10-03 02:11:30
【Question】:

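I have been asked to inject a block of HTML at a specific point in an HTML document, and have been looking at using HTMLAgilityPack to do it. As far as I can tell, the recommended approach is to parse the document into nodes and then replace or remove the relevant nodes.

This is my code so far: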

//Load original HTML
var originalHtml = new HtmlDocument();
originalHtml.Load(@"C:\Temp\test.html");

//Load inject HTML
var inject = new HtmlDocument();
inject.Load(@"C:\Temp\Temp\inject.html");
var injectNode = HtmlNode.CreateNode(inject.Text);

//Get all HTML nodes to inject/delete
var nodesToDelete = originalHtml.DocumentNode.SelectNodes("//p[@style='page-break-after:avoid']");
var countToDelete = nodesToDelete.Count();

//loop through stuff to remove
int count = 0;
foreach (var nodeToDelete in nodesToDelete)
{
    count++;
    if (count == 1)
    {
        //replace with inject HTML
        nodeToDelete.ParentNode.ReplaceChild(injectNode, nodeToDelete);
    }
    else if (count <= countToDelete)
    {
        //remove, as HTML already injected
        nodeToDelete.ParentNode.RemoveChild(nodeToDelete);
    }
}

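What I am finding is that the original HTML is not updated correctly. It looks as though only the top-level node of the injected HTML is inserted: a simple element with no child nodes.

Any help would be appreciated.

Thanks,

Patrick.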

【Comments on the question】:

    Tags: c# html html-agility-pack


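    【Solution 1】:

    Well, I could not work out how to solve this with HTMLAgilityPack, which probably says more about my limited understanding of nodes than anything else, but I did find a simple workaround using AngleSharp.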

    //Requires the AngleSharp NuGet package. This uses the 0.9.x API; in 0.10 and
    //later the parser lives in AngleSharp.Html.Parser and Parse() is ParseDocument().
    using System.IO;
    using System.Linq;
    using AngleSharp.Parser.Html;

    //Load original HTML into document
    var parser = new HtmlParser();
    var htmlDocument = parser.Parse(File.ReadAllText(@"C:\Temp\test.html"));

    //Load inject HTML as raw text
    var injectHtml = File.ReadAllText(@"C:\Temp\inject.html");

    //Get all HTML elements to inject/delete. ToList() takes a snapshot, so the
    //document can be modified while looping over the matches.
    var elements = htmlDocument.All
        .Where(e => e.Attributes.Any(a => a.Name == "style" && a.Value == "page-break-after:avoid"))
        .ToList();

    //loop through stuff to remove
    int count = 1;
    foreach (var element in elements)
    {
        if (count == 1)
        {
            //replace the first match with the inject HTML
            element.OuterHtml = injectHtml;
        }
        else
        {
            //remove, as HTML already injected
            element.Remove();
        }
        count++;
    }

    //Re-write updated file
    File.WriteAllText(@"C:\Temp\test_updated.html", string.Format("{0}{1}{2}{3}", "<html>", htmlDocument.Head.OuterHtml, htmlDocument.Body.OuterHtml, "</html>"));
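
For anyone who would rather stay with HTMLAgilityPack, here is a minimal sketch of one way the same replace-then-remove logic might be written. It rests on the assumption that `HtmlNode.CreateNode` is meant to produce a single node, so the inject HTML is parsed as its own `HtmlDocument` instead and every top-level node is cloned in front of the first match. The class name `HtmlInjector` and the method `Inject` are hypothetical names made up for this example, and it works on strings rather than files to keep it self-contained.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class HtmlInjector
{
    public static string Inject(string originalHtml, string injectHtml)
    {
        var original = new HtmlDocument();
        original.LoadHtml(originalHtml);

        // Parse the fragment as a document so all of its top-level nodes are kept.
        var inject = new HtmlDocument();
        inject.LoadHtml(injectHtml);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var targets = original.DocumentNode
            .SelectNodes("//p[@style='page-break-after:avoid']");
        if (targets == null)
        {
            return original.DocumentNode.OuterHtml;
        }

        var first = targets[0];
        var parent = first.ParentNode;

        // Insert a deep copy of each top-level inject node before the first match.
        foreach (var child in inject.DocumentNode.ChildNodes)
        {
            parent.InsertBefore(child.CloneNode(true), first);
        }

        // Remove every matched paragraph, including the first one.
        foreach (var target in targets.ToList())
        {
            target.Remove();
        }

        return original.DocumentNode.OuterHtml;
    }
}
```

With files, the two strings would come from `File.ReadAllText` and the result would go to `File.WriteAllText`, as in the AngleSharp version above. I have not checked this against the original test.html and inject.html, so treat it as a starting point only.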
    

    【Comments】:
