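【Posted】: 2016-11-05 12:19:36
【Question】:
The following code successfully loops through every element in the DOM and writes each element's details (tag name, ID, class name, and so on) to an Excel worksheet.
My question is:
How do I retrieve each element's tag attributes (title, href, and so on)? Specifically, for "A" tags, how do I get the "href" attribute?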
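' Early binding: needs references to "Microsoft Internet Controls"
' and "Microsoft HTML Object Library".
Enum READYSTATE
    READYSTATE_UNINITIALIZED = 0
    READYSTATE_LOADING = 1
    READYSTATE_LOADED = 2
    READYSTATE_INTERACTIVE = 3
    READYSTATE_COMPLETE = 4
End Enum

' The Sub name is only a wrapper so the snippet can run; the original post showed the body alone.
Sub ListDomElements()
    Dim ie As InternetExplorer
    Dim html As HTMLDocument
    Dim element As Object
    Dim RowNumber As Long   ' Long, because Integer overflows past 32,767 rows

    Set ie = New InternetExplorer
    ie.Visible = False
    ie.navigate "www.somesite.com"

    ' Wait until the page has finished loading
    Do While ie.readyState <> READYSTATE_COMPLETE
        Application.StatusBar = "Connecting..."
        DoEvents
    Loop
    Application.StatusBar = False   ' hand the status bar back to Excel

    Set html = ie.document
    RowNumber = 1

    ' Write one worksheet row per DOM element
    For Each element In html.all
        Cells(RowNumber, "A").Value = element.tagName
        Cells(RowNumber, "B").Value = element.ID
        Cells(RowNumber, "C").Value = element.className
        Cells(RowNumber, "D").Value = element.innerHTML
        RowNumber = RowNumber + 1
    Next element

    ie.Quit   ' close the hidden browser instance
End Sub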
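For reference, here is a minimal sketch of one way the attributes might be read, assuming the elements returned by html.all expose the DOM getAttribute method, which MSHTML elements generally do. The helper AttributeText, the Sub WriteElementAttributes, and the target columns E and F are hypothetical names chosen for illustration and are not part of the code above. How a missing attribute is reported can vary with the document mode, so the helper treats both Null and Empty as "not present".

```vba
' Hypothetical helper: returns the named attribute as text,
' or an empty string when the element does not carry it.
Function AttributeText(ByVal element As Object, ByVal attrName As String) As String
    Dim v As Variant
    v = element.getAttribute(attrName)
    If IsNull(v) Or IsEmpty(v) Then
        AttributeText = ""
    Else
        AttributeText = CStr(v)
    End If
End Function

' Hypothetical Sub: walks the same html.all collection as the loop above.
' Columns E and F are assumptions; pick whichever columns are free.
Sub WriteElementAttributes(ByVal html As Object)
    Dim element As Object
    Dim RowNumber As Long

    RowNumber = 1
    For Each element In html.all
        ' title may exist on any element
        Cells(RowNumber, "E").Value = AttributeText(element, "title")
        ' href is only read for anchor ("A") tags
        If UCase$(element.tagName) = "A" Then
            Cells(RowNumber, "F").Value = AttributeText(element, "href")
        End If
        RowNumber = RowNumber + 1
    Next element
End Sub
```

Calling WriteElementAttributes html after Set html = ie.document would fill the extra columns on the same rows as the existing loop. Anchor elements also expose an href property directly (element.href), which may be simpler when only links are of interest.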
Any help would be greatly appreciated.
【Discussion】:
Tags: vba web-scraping attributes href