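【Posted】: 2021-02-01 22:35:03
【Question】:
I am trying to set up web-scraping VBA code to import data from this website into Excel: https://www.thewindpower.net/windfarms_list_en.php
I would like to open this page, select a country, and then scrape the data from the table below it (including the URLs in the Name column).
However, I have several questions:
- How do I select the country I want from VBA code?
- The table tag has no id or class, so how do I select the table?
- How do I import the URLs contained in the Name column?
Here is the code I have prepared so far (based on some research on the web):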
Sub Grabdata()
    'requires a reference to "Microsoft Internet Controls" (Tools > References)
    'dimension (set aside memory for) our variables
    Dim objIE As InternetExplorer
    Dim ele As Object
    Dim y As Long
    'start a new browser instance
    Set objIE = New InternetExplorer
    'make browser visible
    objIE.Visible = True
    'navigate to page with needed data
    objIE.navigate "https://www.thewindpower.net/windfarms_list_en.php"
    'wait for page to load
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    'we will output data to excel, starting on row 1
    y = 1
    'look at all the 'tr' elements in the 'table' with id 'myTable',
    'and evaluate each, one at a time, using 'ele' variable
    For Each ele In objIE.document.getElementById("myTable").getElementsByTagName("tr")
        'show the text content of 'tr' element being looked at
        Debug.Print ele.textContent
        'each 'tr' (table row) element contains 4 children ('td') elements
        'put text of 1st 'td' in col A
        Sheets("Sheet1").Range("A" & y).Value = ele.Children(0).textContent
        'put text of 2nd 'td' in col B
        Sheets("Sheet1").Range("B" & y).Value = ele.Children(1).textContent
        'put text of 3rd 'td' in col C
        Sheets("Sheet1").Range("C" & y).Value = ele.Children(2).textContent
        'put text of 4th 'td' in col D
        Sheets("Sheet1").Range("D" & y).Value = ele.Children(3).textContent
        'increment row counter by 1
        y = y + 1
    'repeat until last ele has been evaluated
    Next
    'save the Excel workbook
    ActiveWorkbook.Save
End Sub
【Comments】:
-
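What do you mean by selecting the table? Do you want to grab the content of that table? You don't need IE for that. You can simply send an HTTP POST request with the appropriate parameters to populate the table.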
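As a rough illustration of the POST approach suggested in this comment, a minimal sketch might look like the following. The form field name `country_field` and its value `country_value` are hypothetical placeholders, not taken from the site; the real ones would have to be read from the request the page sends when a country is chosen (for example in the browser's network tab). Because the table has no id or class, the sketch only lists every table found by tag name so the right one can be identified by position.

```vba
Option Explicit

' Sketch only: sends a form-encoded POST and lists the tables in the response.
' "country_field" and "country_value" are HYPOTHETICAL placeholders; replace
' them with the parameter the page actually submits.
Sub ListTablesViaPost()
    Const PAGE_URL As String = "https://www.thewindpower.net/windfarms_list_en.php"
    Dim http As Object      ' late-bound MSXML2.XMLHTTP
    Dim doc As Object       ' late-bound HTML document
    Dim tables As Object
    Dim i As Long

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "POST", PAGE_URL, False               ' False = synchronous
    http.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    http.send "country_field=country_value"         ' hypothetical form body

    If http.Status <> 200 Then
        Debug.Print "Request failed: " & http.Status & " " & http.statusText
        Exit Sub
    End If

    ' Parse the returned HTML without opening a browser window
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = http.responseText

    ' No id or class on the table, so select by tag name and position
    Set tables = doc.getElementsByTagName("table")
    For i = 0 To tables.Length - 1
        Debug.Print "Table " & i & ": " & Left$(tables.Item(i).innerText, 60)
    Next i
End Sub
```

Running this and reading the Immediate window should show which index holds the wind farm list; that index can then be passed to `tables.Item(...)` in later code. Whether the site accepts a plain POST without cookies or extra headers is not confirmed here.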
-
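I want to get the whole table for the country I select (for example, the United Kingdom). Once I have this table (with the URLs in the Name column), I would like to run another macro for each row of the table that visits the URL in the Name column and grabs some data from that page.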
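The two steps described in this comment could be sketched roughly as below: one routine writes each row's cell text plus the first link in the row to the sheet, and a second routine visits every stored link. The sheet name `Sheet1` comes from the question; the link column (column E) and the routine names are assumptions made for illustration, and `tbl` is expected to be a table element obtained elsewhere, for example via `getElementsByTagName("table")`.

```vba
Option Explicit

' Assumption: links are stored in column E, next to the four text columns A:D
Private Const LINK_COL As Long = 5

' Writes the text of every cell in 'tbl' to Sheet1 and, when a row contains
' a link, stores the href of the first one in LINK_COL.
Sub WriteTableWithLinks(ByVal tbl As Object)
    Dim ws As Worksheet
    Dim rowEl As Object, links As Object
    Dim r As Long, c As Long

    Set ws = ThisWorkbook.Worksheets("Sheet1")

    For r = 0 To tbl.Rows.Length - 1
        Set rowEl = tbl.Rows.Item(r)
        For c = 0 To rowEl.Cells.Length - 1
            ws.Cells(r + 1, c + 1).Value = rowEl.Cells.Item(c).innerText
        Next c
        Set links = rowEl.getElementsByTagName("a")
        If links.Length > 0 Then
            ws.Cells(r + 1, LINK_COL).Value = links.Item(0).href
        End If
    Next r
End Sub

' Requests every URL stored in LINK_COL and prints the status and page size.
' Replace the Debug.Print with whatever per-page parsing is needed.
Sub VisitStoredLinks()
    Dim ws As Worksheet
    Dim http As Object
    Dim lastRow As Long, r As Long
    Dim target As String

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set http = CreateObject("MSXML2.XMLHTTP")
    lastRow = ws.Cells(ws.Rows.Count, LINK_COL).End(xlUp).Row

    For r = 1 To lastRow
        target = CStr(ws.Cells(r, LINK_COL).Value)
        If Len(target) > 0 Then
            http.Open "GET", target, False
            http.send
            Debug.Print target, http.Status, Len(http.responseText)
        End If
    Next r
End Sub
```

One caveat: when the HTML is parsed in a detached document rather than in a live browser, relative links may come back with an `about:` prefix instead of the site's address, in which case they would need to be rebased onto the site's base URL before `VisitStoredLinks` is run.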
Tags: excel vba web-scraping