【发布时间】:2011-06-24 08:50:12
【问题描述】:
我有这段代码来提取 html 文档中的所有表单输入元素。目前,除了输入元素,我无法获取 select、textarea 和其他元素。
Dim htmldoc As HtmlDocument = New HtmlDocument()
htmldoc.LoadHtml(txtHtml.Text)
Dim root As HtmlNode = htmldoc.DocumentNode
If root Is Nothing Then
tsslStatus.Text = "Error parsing html"
End If
' parse the page content
For Each InputTag As HtmlNode In root.SelectNodes("//input")
'get title
Dim attName As String = Nothing
Dim attType As String = Nothing
For Each att As HtmlAttribute In InputTag.Attributes
Select Case att.Name.ToLower
Case "name"
attName = att.Value
Case "type"
attType = att.Value
End Select
If attName Is Nothing OrElse attType Is Nothing Then
Continue For
End If
Dim sResult As String = String.Format("Type={0},Name={1}", attType, attName).ToLower
If txtResult.Text.Contains(sResult) = False Then
'Debug.Print(sResult)
txtResult.Text &= sResult & vbCrLf
End If
Next
Next
谁能帮助我了解如何获取 html 文档中所有形式的所有元素?
【问题讨论】:
标签: vb.net winforms forms html-agility-pack