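【Question title】: How to extract a full table which loads further when scrolled down?
【Posted】: 2019-11-04 01:51:01
【Question】:

I'm trying to extract the maximum available historical prices from a website that loads more rows into the table as you scroll down the page. The code I'm currently using only pulls the first 100 rows.

Any help would be greatly appreciated.

Thanks.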

Sub pullhistoricalprice()

    Dim xmlpage As New MSXML2.XMLHTTP60
    Dim htmldoc As New MSHTML.HTMLDocument
    Dim objTable As Object
    Dim lRow As Long
    Dim lngTable As Long
    Dim lngRow As Long
    Dim lngCol As Long
    Dim ActRw As Long

    xmlpage.Open "GET", "https://finance.yahoo.com/quote/AAPL/history?period1=345400200&period2=1561046400&interval=1d&filter=history&frequency=1d", False
    xmlpage.send
    htmldoc.body.innerHTML = xmlpage.responseText
    htmldoc.parentWindow.scrollBy 0, 100
    Application.Wait Now + TimeValue("00:00:03")

    With htmldoc.body
        Set objTable = .getElementsByTagName("table")
        For lngTable = 2 To objTable.Length - 2
            For lngRow = 0 To objTable(lngTable).Rows.Length - 1
                For lngCol = 0 To objTable(lngTable).Rows(lngRow).Cells.Length - 1
                    ThisWorkbook.Sheets("Sheet1").Cells(ActRw + lngRow + 1, lngCol + 1) = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText
                Next lngCol
            Next lngRow
            ActRw = ActRw + objTable(lngTable).Rows.Length + 1
        Next lngTable
    End With

End Sub

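【Comments】:

  • Which column of the table does it come from? Also, your code extracts multiple values, whereas a max should be a single value, shouldn't it?

Tags: html excel vba web-scraping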


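【Solution 1】:

You can capture the current cookie by pressing the Download button on the page, then pass that cookie in a WinHttp request to the download URL and save the response as a binary download.

The cookie is in the headers of the recorded request (dev tools: press F12 to open, go to the Network tab, then press Download). Find the request that was made and look at its cookie header.

In theory, this cookie could be extracted with a prior request. Otherwise, this code only works as a time-limited solution, bounded by the cookie's expiry date.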
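
As a rough illustration of the "prior request" idea, the sketch below requests a page first and builds a `Cookie` header value from whatever `Set-Cookie` headers come back. `GetCookieHeader` is a hypothetical helper, not part of the original answer, and it assumes the server actually sends the cookies the download needs on that first response. The `crumb` value in the download URL is presumably tied to the cookie as well and would have to be obtained separately; this sketch does not cover that.

```vba
' Sketch only: collects name=value pairs from the Set-Cookie headers of a prior request.
' Assumption: the cookies required for the download are set by this first response.
Public Function GetCookieHeader(ByVal pageURL As String) As String
    Const PREFIX As String = "set-cookie:"
    Dim http As Object, headerLine As Variant
    Dim cookiePair As String, result As String

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", pageURL, False
    http.send

    ' GetAllResponseHeaders returns all response headers, one per CRLF-separated line
    For Each headerLine In Split(http.GetAllResponseHeaders, vbCrLf)
        If LCase$(Left$(headerLine, Len(PREFIX))) = PREFIX Then
            ' Keep only the leading "name=value" part and drop attributes such as Path or Expires.
            ' The appended ";" guarantees Split returns at least one element.
            cookiePair = Trim$(Split(Mid$(headerLine, Len(PREFIX) + 1) & ";", ";")(0))
            If Len(cookiePair) > 0 Then
                If Len(result) > 0 Then result = result & "; "
                result = result & cookiePair
            End If
        End If
    Next headerLine

    GetCookieHeader = result
End Function
```

The returned string could then replace a hard-coded cookie, for example `http.setRequestHeader "Cookie", GetCookieHeader("https://finance.yahoo.com/quote/AAPL/history")`. Whether that is enough for this particular site is untested here.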

Public Sub Test()
    DownloadFile "C:\Users\User\Desktop\", "https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=345400200&period2=1561046400&interval=1d&events=history&crumb=gdzXUTfT0l/"
End Sub
Public Function DownloadFile(ByVal downloadFolder As String, ByVal downloadURL As String) As String
    Const FILE_NAME As String = "data.csv"
    Dim http As Object
    On Error GoTo errhand   ' enable the handler before any request is made
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", downloadURL, False
    ' Cookie copied from the browser's Download request; it expires, so replace it with a current one
    http.setRequestHeader "Cookie", "B=bujjnjpeetmah; APID=UP86674562-8245-11e9-8433-020167e61c30; PRF=t%3DAAPL%252BASLN%252BA; GUCS=AQBqKRDM; GUC=AQABAQFdDhdd3UImCwVo&s=AQAAADRx8bGn&g=XQzIlw"
    http.send
    ' Treat anything other than 200 OK (e.g. an expired cookie) as a failed download
    If http.Status <> 200 Then Err.Raise vbObjectError + 1, , "HTTP status " & http.Status
    With CreateObject("ADODB.Stream")
        .Open
        .Type = 1   ' 1 = adTypeBinary
        .Write http.responseBody
        .SaveToFile downloadFolder & FILE_NAME, 2   ' downloadFolder must end with "\". 2 = overwrite, which is OK if the file has not been modified.
        .Close
    End With
    DownloadFile = downloadFolder & FILE_NAME   ' return the full path of the saved file
    Exit Function
errhand:
    Debug.Print Err.Number, Err.Description
    MsgBox "Download failed"
    DownloadFile = vbNullString
End Function
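
Since the question writes the prices to Sheet1, one possible follow-up is to copy the downloaded CSV into that sheet. The sketch below is an assumption about how that might be done, not part of the original answer; `ImportCsvToSheet` is a hypothetical helper, and it assumes the workbook running the macro has a sheet named "Sheet1".

```vba
' Sketch only: copies the contents of a downloaded CSV into Sheet1 of the macro workbook.
' Assumptions: csvPath points to an existing CSV file and "Sheet1" exists in ThisWorkbook.
Public Sub ImportCsvToSheet(ByVal csvPath As String)
    Dim csvBook As Workbook

    If Len(csvPath) = 0 Then Exit Sub   ' nothing to import

    ' Excel opens a CSV file as a single-sheet workbook
    Set csvBook = Workbooks.Open(Filename:=csvPath, ReadOnly:=True)

    With ThisWorkbook.Worksheets("Sheet1")
        .Cells.ClearContents            ' remove any previously imported rows
        csvBook.Worksheets(1).UsedRange.Copy Destination:=.Range("A1")
    End With

    csvBook.Close SaveChanges:=False
End Sub
```

With the folder used in `Test`, the call would be `ImportCsvToSheet "C:\Users\User\Desktop\data.csv"`, run after the download has finished.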

【Comments】: