[Question title]: Accessing a website's table with a WinHTTPRequest in Excel VBA
[Posted]: 2018-06-10 03:51:42
[Question description]:

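I wrote code that scrapes a table from a website, extracts each cell from that table, and writes the cells to an Excel spreadsheet. When the website loads correctly, the code works perfectly.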

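The problem is that the website does not work well with Internet Explorer, so the code succeeds only about half the time. I could write a routine that checks whether the page loaded successfully and retries if it did not, but I would like to see whether I can get this working with WinHTTPRequest instead.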

The lines below show how I access the table using Internet Explorer-based scraping; the last line is how I load the table into a variable.

Set IE = CreateObject("InternetExplorer.Application")
IE.navigate "https://weather.com/weather/tenday/l/12345:4:US"
IE.Visible = True

Application.Wait (Now() + TimeValue("00:02:00"))

Set doc = IE.document

Set WeatherTable = doc.getElementsByClassName("twc-table")(0)
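
As a rough sketch of the "check whether the page loaded" routine mentioned above, the fixed wait could be replaced by polling Internet Explorer's `Busy` and `readyState` properties with a timeout. The helper name `WaitForPage`, its parameters, and the constant name are hypothetical and not part of the original code; note also that a completed `readyState` does not guarantee that script-rendered content such as the table is already present.

```vba
' Hypothetical helper (not from the original post): waits until IE reports
' the page as loaded, or until the timeout elapses.
' Returns True when the page finished loading in time, False on timeout.
Function WaitForPage(ByVal IE As Object, ByVal TimeoutSeconds As Long) As Boolean
    Const IE_READY_COMPLETE As Long = 4     ' value of READYSTATE_COMPLETE
    Dim Deadline As Date

    Deadline = DateAdd("s", TimeoutSeconds, Now)

    Do While IE.Busy Or IE.readyState <> IE_READY_COMPLETE
        If Now > Deadline Then Exit Function    ' timed out: result stays False
        DoEvents                                ' keep Excel responsive while polling
    Loop

    WaitForPage = True
End Function
```

A caller could then write `If Not WaitForPage(IE, 30) Then` and navigate again or give up, instead of always waiting the full two minutes.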

I can load the site in question through WinHTTPRequest with the code below.

Set doc = New HTMLDocument

With CreateObject("WINHTTP.WinHTTPRequest.5.1")
    .Open "GET", "https://weather.com/weather/tenday/l/12345:4:US", False
    .send
    doc.body.innerHTML = .responseText
End With
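
Because the request is synchronous, the HTTP status can be inspected before the response is parsed. The following is a minimal sketch under that assumption; the procedure name `LoadPageWithStatusCheck` is hypothetical, and treating anything other than 200 as a failure is an illustrative choice rather than a requirement of the site.

```vba
' Sketch only: requires a reference to "Microsoft HTML Object Library"
' for the HTMLDocument type. The procedure name is hypothetical.
Sub LoadPageWithStatusCheck()
    Dim doc As HTMLDocument
    Dim http As Object

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", "https://weather.com/weather/tenday/l/12345:4:US", False
    http.send

    ' Stop early if the server did not answer with 200 OK
    If http.Status <> 200 Then
        Debug.Print "Request failed: " & http.Status & " " & http.StatusText
        Exit Sub
    End If

    ' Parse the returned markup without opening a browser window
    Set doc = New HTMLDocument
    doc.body.innerHTML = http.responseText
End Sub
```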

However, when I try to grab the table with the line below, I get "Run-time error '438': Object doesn't support this property or method".

Set WeatherTable = doc.getElementByclassname("twc-table")(0)

Basically, I need the WinHTTP scraping equivalent of this line.

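I have looked at walking down through the HTML document (doc.body.all.item(1) and so on), but I did not get far before running into errors. I have also looked at the Selenium plug-in, but I don't recall being able to download and install it successfully, and I'm not sure whether it still works with current versions of Chrome / Firefox.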

Here is the full code that lets me get the table through Internet Explorer scraping and then write it to an Excel spreadsheet.

Any help is appreciated.

Sub GetTable2()

Dim IE As Object
Dim doc As HTMLDocument
Dim WeatherTable As HTMLTable
Dim WeatherTableRows As HTMLTableRow
Dim HTMLTableCell As HTMLTableCell
Dim HeaderRow As Boolean

Dim RowCount As Long
Dim ColumnCount As Long

Dim i As Long

RowCount = 1
ColumnCount = 1
HeaderRow = True

Set IE = CreateObject("InternetExplorer.Application")
IE.navigate "https://weather.com/weather/tenday/l/12345:4:US"
IE.Visible = True

'Application.Wait (Now() + TimeValue("00:02:00"))

Set doc = IE.document

Set WeatherTable = doc.getElementsByClassName("twc-table")(0)

    For Each WeatherTableRows In WeatherTable.Rows
        i = 1
        For Each HTMLTableCell In WeatherTableRows.Cells
            If HeaderRow = True Then
                ThisWorkbook.Sheets("Sheet5").Cells(RowCount, ColumnCount).Value = HTMLTableCell.innerText
                ColumnCount = ColumnCount + 1
            Else
                If i = 1 Then
                    i = i + 1
                Else
                    ThisWorkbook.Sheets("Sheet5").Cells(RowCount, ColumnCount).Value = HTMLTableCell.innerText
                    ColumnCount = ColumnCount + 1
                End If
            End If
        Next HTMLTableCell
        HeaderRow = False
        ColumnCount = 1
        RowCount = RowCount + 1
    Next WeatherTableRows

IE.Quit
Set IE = Nothing
Set doc = Nothing

End Sub

[Comments on the question]:

    Tags: html excel vba internet-explorer


    [Solution 1]:

    You missed the s. The method name is plural because it returns a collection of elements that match the class name.

    Set WeatherTable = doc.getElementsByClassName("twc-table")(0)
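
Since the method returns a collection, it may also be worth guarding against an empty result before indexing into it, for example when the downloaded markup does not contain the table. A minimal sketch, assuming a reference to Microsoft HTML Object Library; the function name `FirstByClass` is hypothetical:

```vba
' Hypothetical helper: returns the first element carrying the given class,
' or Nothing when the document contains no such element.
Function FirstByClass(ByVal doc As HTMLDocument, ByVal ClassName As String) As Object
    Dim Matches As Object

    Set Matches = doc.getElementsByClassName(ClassName)   ' note the plural "Elements"

    If Matches.Length > 0 Then
        Set FirstByClass = Matches(0)
    End If
End Function
```

With this helper, `Set WeatherTable = FirstByClass(doc, "twc-table")` followed by `If WeatherTable Is Nothing Then` lets the caller handle a missing table explicitly.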
    

    [Comments]:

    • Wow, OK... thank you very much for the quick reply!
    [Solution 2]:

    To make your approach a little cleaner, you could also try it this way.

    Sub FetchTabularData()
        Dim elem As Object, trow As Object, S$, R&, C&
    
        [B1:G1] = [{"Day","Description","High/Low","Precip","Wind","Humidity"}]
    
        With New WinHttp.WinHttpRequest
            .Open "GET", "https://weather.com/weather/tenday/l/12345:4:US", False
            .send
            S = .responseText
        End With
    
        With New HTMLDocument
            .body.innerHTML = S
    
            For Each elem In .querySelector(".twc-table").getElementsByTagName("tr")
                For Each trow In elem.getElementsByTagName("td")
                    C = C + 1: Cells(R + 1, C) = trow.innerText
                Next trow
                C = 0: R = R + 1
            Next elem
        End With
    End Sub
    

    References to add:

    Microsoft HTML Object Library
    Microsoft WinHTTP Services, version 5.1
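
The answer's code relies on two VBA shorthands: type-declaration characters (`S$`, `R&`, `C&`) and the square-bracket `Evaluate` syntax, which writes to whichever sheet is active. The sketch below spells out explicit equivalents; the procedure name is hypothetical, and targeting "Sheet5" is an assumption borrowed from the question's code rather than something the answer specifies.

```vba
' Sketch only: explicit forms of the shorthand used in the answer above.
Sub ShorthandEquivalents()
    Dim S As String      ' same as S$
    Dim R As Long        ' same as R&
    Dim C As Long        ' same as C&

    ' [B1:G1] = [{...}] writes the headers to the active sheet.
    ' Naming the sheet avoids depending on which sheet is active
    ' ("Sheet5" is assumed here, taken from the question).
    ThisWorkbook.Sheets("Sheet5").Range("B1:G1").Value = _
        Array("Day", "Description", "High/Low", "Precip", "Wind", "Humidity")
End Sub
```

The same applies to the unqualified `Cells(R + 1, C)` call in the loop, which could be prefixed with the target worksheet in the same way.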
    

    [Comments]:
