【发布时间】:2019-09-21 05:57:24
【问题描述】:
以下是我发现并更改的一些代码,以尝试从亚马逊的搜索栏中捕获关键字/短语建议。我对网络抓取的概念非常陌生,所以我知道这里提供的代码可能非常低效和低效。我已经从 F12 DOM Explorer 和 Network 窗口手动捕获了一些数据。如果最好的答案是网络抓取,我需要 excel vba 的形式。我在下面的一些图像中看到,网络窗口中的某些内容类型似乎是“application/json”,而 Initiator/Type 是“XMLHttpRequest”,但这只是在它显示连接和身份验证之后“https://completion.amazon.com”。如果那是路线,我不知道如何完成这些请求。任何帮助将非常感激。
到目前为止,我已经尝试通过代码中的脚本以编程方式调用搜索栏,但我看不到任何效果。简单地将关键字“粘贴”到搜索栏中并附加一个“空格”不会产生建议的关键字。但是,在搜索栏中输入即可。如果我输入关键字,然后选择下拉建议的“检查元素”,则会生成动态 HTML 以显示建议的 HTML 内容(此时我可以得到我需要的内容)。我一直没有成功达到这一点。
私有子命令按钮1_Click()
Dim MyHTML_Element As IHTMLElement
Dim MyURL As String
Dim AASearchRank As Workbook
Dim AAws As Worksheet
Dim InputSearch As HTMLInputTextElement
Dim elems As IHTMLElementCollection
Dim TDelement As HTMLTableCell
Dim elems2 As IHTMLElementCollection
Dim TDelement2 As HTMLDivElement
'Dim TDelement2 As HTMLInputTextElement
Dim InputSearchButton As HTMLInputButtonElement
Dim IE As InternetExplorer
Dim x As Integer
Dim i As Long
MyURL = "https://www.amazon.com/"
Set IE = New InternetExplorer
With IE
.Silent = True
.Navigate MyURL
.Visible = True
Do
DoEvents
Loop Until .ReadyState = READYSTATE_COMPLETE
End With
Set HTMLDoc = IE.Document
Set AASearchRank = Application.ThisWorkbook
Set AAws = AASearchRank.Worksheets("Sheet2")
Set InputSearchButton = HTMLDoc.getElementById("nav-search-submit-text")
Set InputSearchOrder = HTMLDoc.getElementById("twotabsearchtextbox")
If Not InputSearchOrder Is Nothing Then
InputSearchButton.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
End If
x = 2
If AAws.Range("D" & x).Value = "" Then
Do Until AAws.Range("B" & x) = ""
Set InputSearch = HTMLDoc.getElementById("twotabsearchtextbox")
InputSearch.Focus
'When a keyword is typed in the search bar with a 'space' after, it invokes the suggestions I'm looking for.
InputSearch.Value = "Travel "
'InputSearch.Value = AAws.Range("C" & x) & " "
Set InputSearchButton = HTMLDoc.getElementsByClassName("nav-input")(0)
InputSearch.Focus
'Here I was trying to invoke some script to see if it had any effect on the search bar drop down
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'UpNav',end:+new Date(),begin:window.navmet.tmp});"
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'Search',end:+new Date(),begin:window.navmet.tmp});"
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'NavBar',end:+new Date(),begin:window.navmet.main});"
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
'Application.Wait (Now + TimeValue("0:00:05"))
Set elems2 = HTMLDoc.getElementsByClassName("nav-issFlyout nav-flyout")
i = 0
For Each TDelement2 In elems2
'Debug statements strictly for learning what each option/query returns
Debug.Print TDelement2.innerText
Debug.Print TDelement2.className
Debug.Print TDelement2.dataFld
Debug.Print TDelement2.innerHTML
Debug.Print TDelement2.outerText
Debug.Print TDelement2.outerHTML
Debug.Print TDelement2.parentElement.className
Debug.Print TDelement2.tagName
Debug.Print TDelement2.ID
Next
'Once the searchbar is populated, and the drop down list provides suggestions,
'the below code will give me what I want. If there's an easier solution,
'I'm all for it
Set elems = HTMLDoc.getElementsByClassName("s-suggestion")
i = 0
For Each TDelement In elems
If Left(TDelement.ID, 6) = "issDiv" Then
Debug.Print TDelement.innerText
Debug.Print TDelement.ID
End If
Next
x = x + 1
Loop
End If
结束子
一个理想的解决方案是通过调用搜索栏动态 HTML 或通过亚马逊的补全站点来获取这些建议的关键字,但似乎这可能不对公众开放。感谢您提供的任何帮助,并为任何发布缺陷提前道歉。
【问题讨论】:
标签: html excel vba dom web-scraping