【Posted】:2018-01-17 11:11:56
【Problem description】:
Situation:
I am downloading files from the web page NHS Delayed Transfers of Care.
In the HTML I can see the following:
onclick="ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');"
After looking here and at these SO questions (among others):
- Click button or execute JavaScript function with VBA
- How to find and call javascript method from vba
- Call a javascript function
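I got the impression that ga() is a JavaScript function that I should be able to call directly with .execScript.
Question:
Can I execute the JavaScript function with .execScript to download the file? If not, how can I download the file?
What I have tried:
I tried the following, all of which failed: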
1) Call html.parentWindow.execScript("ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');", "Javascript")
Error -2147352319: Automation error
2) Call html.frames(0).execScript("ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');", "Javascript")
Error 438: Object doesn't support this property or method
3) Call currentWindow.execScript("ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');", "Javascript")
Error 91: Object variable or With block variable not set
4) Call CurrentWindow.execScript("ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');", "Javascript")
Error -2147352319: Could not complete the operation due to error 80020101.
I admit I know very little about this kind of operation. Can anyone see where I am going wrong?
Code:
Option Explicit

' Requires references to Microsoft XML, v6.0 and Microsoft HTML Object Library.
Public Sub DownloadDTOC()
    Dim http As New XMLHTTP60
    Dim html As New HTMLDocument
    Dim CurrentWindow As HTMLWindowProxy

    With http
        .Open "GET", "https://www.england.nhs.uk/statistics/statistical-work-areas/delayed-transfers-of-care/delayed-transfers-of-care-data-2017-18/", False
        .send
        html.body.innerHTML = .responseText
    End With

    On Error GoTo Errhand
    'Call html.parentWindow.execScript("ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');", "Javascript") '-2147352319 Automation error
    'Call html.frames(0).execScript("ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');", "Javascript") '438 Object doesn't support this property or method
    'automation error
    'Call currentWindow.execScript("ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');", "Javascript") ' 91 Object variable or With block variable not set
    Set CurrentWindow = html.parentWindow
    Call CurrentWindow.execScript("ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');", "Javascript") '-2147352319 Could not complete the operation due to error 80020101.
    Exit Sub
Errhand:
    If Err.Number <> 0 Then Debug.Print Err.Number, Err.Description
End Sub
Added for reference:
This is a simplified version of the HTML. Apologies, I am not used to formatting HTML.
<p>
<a href="https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls" class="xls-link" onclick="ga('send', 'event', 'Downloads', 'XLS', 'https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls');">Total Delayed Days Local Authority 2017-18 November (XLS, 121KB)</a>
<br>
</p>
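The anchor above already carries the file URL in its href, so one possible route is to read the links out of the page text rather than run the onclick handler. The following is a minimal sketch, not a confirmed solution: ExtractXlsLinks and DemoExtractXlsLinks are hypothetical names, and it assumes the server returns the anchors in the raw response text (if the links are injected by script after load, nothing will match).

```vba
Option Explicit

' Hypothetical helper: returns every href value ending in .xls found in pageHtml.
' Uses late binding, so no extra project references are needed.
Public Function ExtractXlsLinks(ByVal pageHtml As String) As Collection
    Dim re As Object
    Dim m As Object
    Dim links As New Collection

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    ' Matches href="....xls" and captures the URL between the quotes.
    re.Pattern = "href=""([^""]+\.xls)"""

    For Each m In re.Execute(pageHtml)
        links.Add m.SubMatches(0)
    Next m

    Set ExtractXlsLinks = links
End Function

' Hypothetical demo: runs the helper on a cut-down copy of the anchor from the question.
Public Sub DemoExtractXlsLinks()
    Dim sample As String
    Dim link As Variant

    sample = "<a href=""https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls"" class=""xls-link"">x</a>"

    For Each link In ExtractXlsLinks(sample)
        Debug.Print link
    Next link
End Sub
```

In the real procedure, the string passed in would be the .responseText of the GET request rather than the hard-coded sample.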
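【Comments】:
- I know IE handles .execScript just fine. Have you tried opening the page in a hidden IE window and then executing your script?
- Have you tried getting the text from the xls-link class? The onclick is available in that class too. That said, I suspect the xmlhttp60 request will not be able to get anything from that page, since it cannot even parse the text in that class. The site's content is generated dynamically. You should go with IE.
- I will try IE. I was deliberately avoiding it because it is slow.
- @Shahin By the way, when I tried to get elements by className using "xls-link", nothing was returned. Does this have anything to do with .OuterHTML versus .Inner?
- That ga() just calls Google Analytics and does not affect the download. Do you really need to call it?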
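If ga() only reports the click to Google Analytics, the file itself should be reachable by requesting the URL from the anchor's href directly. Below is a rough sketch of that idea, under a few assumptions: the procedure name DownloadXlsDirect and the save location (the user's TEMP folder) are made up for illustration, and the server is assumed to serve the file to a plain GET request without cookies or a referrer.

```vba
Option Explicit

' Sketch: fetch the XLS straight from the href URL, skipping the ga() call.
' Late binding is used, so no project references are required.
Public Sub DownloadXlsDirect()
    Const FILE_URL As String = "https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2018/01/LA-Type-B-November-2017-2ayZP.xls"
    Dim savePath As String
    Dim http As Object
    Dim stream As Object

    ' Assumption: save into the user's temp folder; change as needed.
    savePath = Environ$("TEMP") & "\LA-Type-B-November-2017.xls"

    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "GET", FILE_URL, False
    http.send

    If http.Status <> 200 Then
        Debug.Print "Request failed: " & http.Status & " " & http.statusText
        Exit Sub
    End If

    ' Write the raw response bytes to disk.
    Set stream = CreateObject("ADODB.Stream")
    With stream
        .Type = 1                ' adTypeBinary
        .Open
        .Write http.responseBody
        .SaveToFile savePath, 2  ' adSaveCreateOverWrite
        .Close
    End With

    Debug.Print "Saved to " & savePath
End Sub
```

This stays with XMLHTTP rather than automating IE, which keeps the speed advantage mentioned in the comments, at the cost of not firing the analytics event.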
Tags: javascript html vba excel web-scraping