【Question title】: Need Example HtmlAgilityPack
【Posted】: 2019-09-29 09:49:42
【Question】:

I am trying web scraping again, as an exercise.

This is the code I have so far:

Imports System
Imports System.Xml
Imports HtmlAgilityPack
Imports System.Net
Imports System.IO
Imports System.Collections.Generic


Public Class Program
    Public Shared Sub Main()
        'Enable SSL support (TLS 1.2)'
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12
        'Web page to scrape'
        Dim link As String = "https://www.nextinpact.com"
        'Download the page from the link into an HtmlDocument'
        Dim doc As HtmlDocument = New HtmlWeb().Load(link)
        'Select the section, then the titles'

        Dim div As HtmlNode = doc.DocumentNode.SelectSingleNode("//section[@class='small_article_section']")

        If Not div Is Nothing Then
            For Each node As HtmlNode In doc.DocumentNode.SelectNodes("//h2[@class='color_title']//a[@class='ui-link'][contains(text())]")
                Console.Write(div.InnerText.Trim())
            Next
        End If
    End Sub
End Class

What I am trying to do is get all the titles inside

"//section[@class='small_article_section']"

How can I get all of the titles? For the first title, the XPath is

"//h2[@class='color_title']//a[@class='ui-link'][contains(text(),'Les Netflix passernt d')]"

Thanks.

Edit: I tried another example:

Dim doc As HtmlDocument = New HtmlWeb().Load("https://www.sideshow.com/collectibles?manufacturer=sideshow+collectibles&type=premium+format%28tm%29+figure&brand=aspen")
Dim div As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@class='c-ProductList row']")

Now I am trying to get the title of each product, using:

For Each node As HtmlNode In div.SelectNodes("//h2[contains(text(),'Grace')]") 'This is for Grace only
    Console.Write(node.InnerText.Trim())
Next

But with

//h2[contains(text(),'Grace')]

I get nothing. I want both Grace and Aspen, so I also tried

.//h2[contains(text()]

and that returns nothing either.

【Question comments】:

    Tags: vb.net html-agility-pack


    【Solution 1】:

    This is how you can do it.

        Dim doc As HtmlDocument = New HtmlWeb().Load("https://www.nextinpact.com/")
        Dim div As HtmlNode = doc.DocumentNode.SelectSingleNode("//section[@class='small_article_section']")
    
        'If div IsNot Nothing Then 'I think this part is pointless as it will always exist
        For Each node As HtmlNode In div.SelectNodes(".//h2[@class='color_title']/a") 'a class='ui-link' doesn't exist so do h2/a
            Console.Write(node.InnerText.Trim())
        Next
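
    For readers who want to try the relative XPath without hitting the live site, here is a minimal sketch. It assumes a small, made-up HTML string (`SampleHtml` is hypothetical and only imitates the structure described in the question; the real page may differ). It illustrates two points: an XPath that starts with `//` searches from the document root even when called on a child node, while `.//` stays inside that node, and `SelectNodes` returns `Nothing` when no node matches, so a check before the loop is a reasonable precaution. Note also that the XPath `contains()` function takes two arguments, so `contains(text())` on its own is not a valid expression.

    ```vbnet
    Imports System
    Imports HtmlAgilityPack

    Public Module RelativeXPathSketch
        ' Hypothetical markup standing in for the real page (an assumption, not the site's actual HTML).
        Private Const SampleHtml As String =
            "<html><body>" &
            "<h2 class='color_title'><a href='/x'>Title outside the section</a></h2>" &
            "<section class='small_article_section'>" &
            "<h2 class='color_title'><a href='/a'>First title</a></h2>" &
            "<h2 class='color_title'><a href='/b'>Second title</a></h2>" &
            "</section>" &
            "</body></html>"

        Public Sub Main()
            ' Parse the in-memory HTML instead of downloading a page.
            Dim doc As New HtmlDocument()
            doc.LoadHtml(SampleHtml)

            ' SelectSingleNode returns Nothing when the section is missing.
            Dim section As HtmlNode = doc.DocumentNode.SelectSingleNode("//section[@class='small_article_section']")
            If section Is Nothing Then
                Console.WriteLine("Section not found.")
                Return
            End If

            ' The leading "." keeps the search inside the section;
            ' without it, "//h2..." would also match the h2 outside the section.
            Dim links As HtmlNodeCollection = section.SelectNodes(".//h2[@class='color_title']/a")
            If links Is Nothing Then
                Console.WriteLine("No titles found.")
                Return
            End If

            For Each link As HtmlNode In links
                Console.WriteLine(link.InnerText.Trim())
            Next
        End Sub
    End Module
    ```

    With this sample markup, the loop should list only the two titles inside the section. The same pattern (a relative `.//h2` path plus a `Nothing` check) would apply to the product list in the second example, assuming the titles there are also `h2` elements under the selected `div`.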
    

    【Comments】:
