【Question title】: VBA Excel IE can't find table in innerHTML
【Posted】: 2015-04-10 00:48:21
【Question】:

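I'm trying to copy a table from a web page. I can't copy the whole page because it has buttons and dynamic elements, and pasting those into the worksheet breaks the code with a memory overload, so I'm trying to pull the HTML and paste only the table into Excel.

When I copy the full page source into Word it reports about 23k characters, but innerHTML and outerHTML both come back at around 15-16k.

I know innerHTML and outerHTML leave out a lot of functions and other material that sits outside the HTML body, but what puzzles me is that they are missing the table I need, which sits in the middle of the code.

Page source: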

<div class="row" >
            <div class="col-lg-12 col-md-12 col-sm-12" >


            </div>

                    <div class="col-lg-12 col-md-12 col-sm-12" >
                        <table class="table table-hover table-bordered table-striped " >
                            <thead>
                                <tr style="background:#eee">
                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=day&amp;order=asc">Date</a></th>



                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=jobs&amp;order=asc">Current Jobs Listed</a></th>

                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=impressions&amp;order=asc">Impressions</a></th>
                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=clicks&amp;order=asc">Clicks</a></th>

                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=cpc&amp;order=asc">CPC</a></th>
                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=ctr&amp;order=asc">CTR</a></th>
                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=cost&amp;order=asc">Estimated cost</a></th>

                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=daily_budget&amp;order=asc">Current Daily Budget</a></th>
                                    <th style="vertical-align:top" ><a href="#" onclick="return false;">Edit Campaign</a></th>
                                    <th style="vertical-align:top" ></th>

                                </tr>
                            </thead>
                            <tbody>

                                    <tr class="odd 2015-03-11">
                                        <td>2015-03-11</td>



                                        <td class="jobsListed" >437879</td>

                                        <td>148397</td>
                                        <td>1379</td>

                                        <td>$0.36</td>
                                        <td>0.93%</td>
                                        <td >$491.16</td>

                                        <td class="dailyBudget">$15500.00</td>
                                        <td ><a href="/employer/campaign/">Edit</a></td>

                                    </tr>

                                <tr class="dg" >

                                    <td  colspan="1"  class="text-right"><b>Total:</b></td>

                                    <td class="jobsListed" >437879</td>

                                    <td>148397</td>
                                    <td>1379</td>

                                    <td>$0.36</td>
                                    <td>0.93%</td>
                                    <td >$491.16</td>

                                    <td class="dailyBudget">$15500.00</td>
                                    <td ></td>
                                    <td ></td>
                                </tr>
                            </tbody>
                        </table>
                    </div>

        </div>


        </div><!--container ends here -->

This is how I'm trying to get the table data:

Dim appIE As Object ' InternetExplorer.Application
Set appIE = CreateObject("InternetExplorer.Application")


    Dim strSource As String
    Dim TableString As String
    strSource = CStr(appIE.document.body.outerHTML)
    TableString = Mid(strSource, _
    InStr(strSource, "<table"), _
    InStr(strSource, "</table>") - InStr(strSource, "<table"))

    Dim ClipBoard As New DataObject
    ClipBoard.SetText TableString
    ClipBoard.PutInClipboard

It raises an error because it can't find <table in the string. I stepped through the string and found that the spot where the table should be looks like this:

 class="col-lg-12 col-md-12 col-sm-12">


            </div>

        </div>


        </div><!--container ends here -->

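Any ideas? Thanks.

【Comments】:

  • Maybe the table is dynamic and isn't served unless you "hover" over some area? That's only a wild guess based on the descriptive class names; without a URL it's hard for anyone to give specific help. I'd also ask why you're using string functions to parse HTML. Use a proper DOM parser for HTML or XML documents; these have great methods such as .getElementsByClassName and others designed specifically for traversing the nodes of an XML/HTML tree.
  • I tried to get it by class name, something like: strSource = cstr(appIE.document.getElementsByTagName("table table-hover table-bordered table-striped").innerhtml) but it still gives me an empty table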
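
A minimal sketch of the DOM approach suggested in the first comment, assuming late binding and that appIE already points at the fully loaded data page. Note that getElementsByTagName expects a tag name ("table"), not a list of class names, and that it returns a collection, so you have to index into it before reading innerHTML or cells. The procedure name CopyFirstTableToSheet and the choice of the active sheet are made up for illustration.

```vba
Option Explicit

' Sketch: write the first <table> of the loaded page into the active sheet.
' Assumes appIE is an InternetExplorer.Application that has finished loading.
Sub CopyFirstTableToSheet(ByVal appIE As Object)
    Dim tables As Object, tbl As Object
    Dim r As Long, c As Long

    Set tables = appIE.document.getElementsByTagName("table")
    If tables.Length = 0 Then
        ' Nothing to copy; report which page IE thinks it is on.
        MsgBox "No <table> found on " & appIE.LocationURL
        Exit Sub
    End If

    Set tbl = tables.Item(0)
    For r = 0 To tbl.Rows.Length - 1
        For c = 0 To tbl.Rows.Item(r).Cells.Length - 1
            ActiveSheet.Cells(r + 1, c + 1).Value = _
                tbl.Rows.Item(r).Cells.Item(c).innerText
        Next c
    Next r
End Sub
```

Reading cells through the DOM avoids the Mid/InStr arithmetic entirely, and the Length check makes a missing table visible instead of raising an error.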

Tags: html excel vba internet-explorer


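【Solution 1】:

I finally figured out what the problem was!

IE was visually loading the page, but the machine still thought it was on the login screen. I could see this by checking appIE.LocationURL in the Immediate window.

So it makes sense that it couldn't find the table on the page, because the table doesn't exist on the login page.

The fix for this is really simple.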
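
The check described above can also be wrapped in a small function instead of being typed into the Immediate window. This is only a sketch: IsOnDataPage is a hypothetical helper name, and it assumes appIE is the existing IE object and sURL is the address of the data page.

```vba
' Sketch: True when IE itself reports the expected data-page address.
Function IsOnDataPage(ByVal appIE As Object, ByVal sURL As String) As Boolean
    ' Print what IE believes it has loaded (visible in the Immediate window).
    Debug.Print "IE reports: " & appIE.LocationURL
    ' Case-insensitive comparison of the two addresses.
    IsOnDataPage = (StrComp(appIE.LocationURL, sURL, vbTextCompare) = 0)
End Function
```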

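  1. Check whether the machine thinks it is on the login page or on the data page.
  2. If it is not on the data page, recreate the IE application and load a new window and page. (By then you will already be logged in, so make sure you have a check for that. I did this by looking for an element that I know exists only on the data page and not on the login page.)
  3. Make sure you close all IE windows, because at this point appIE.Quit alone only closes the most recent window.
  4. Make sure you have an error handler, otherwise this could loop forever.

Code: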
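
The steps above can be sketched as a bounded retry loop rather than an open-ended GoTo. This is an illustration under stated assumptions, not the answerer's code: OpenDataPage and MAX_TRIES are hypothetical names, and "the page contains at least one table" stands in for whatever element exists only on your data page.

```vba
Option Explicit

Const MAX_TRIES As Long = 3   ' assumption: give up after three attempts

' Sketch: returns an IE object sitting on the data page,
' or Nothing if every attempt ended on the login page.
Function OpenDataPage(ByVal sURL As String) As Object
    Dim ie As Object, attempt As Long

    For attempt = 1 To MAX_TRIES
        Set ie = CreateObject("InternetExplorer.Application")
        ie.Visible = True
        ie.Navigate sURL
        Do While ie.Busy Or ie.ReadyState <> 4   ' 4 = READYSTATE_COMPLETE
            DoEvents
        Loop

        ' Assumed marker: only the data page contains a <table>.
        If ie.document.getElementsByTagName("table").Length > 0 Then
            Set OpenDataPage = ie
            Exit Function
        End If

        ie.Quit            ' not the data page: discard this instance and retry
        Set ie = Nothing
    Next attempt
End Function
```

The caller should test the result with Is Nothing before using it, which replaces the risk of an endless loop with an explicit failure case.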

MakeIE:
Set appIE = CreateObject("InternetExplorer.Application")
...
With appIE
    .Navigate sURL
    Application.Wait (Now + TimeValue("00:00:01"))
    .Visible = True
    .Height = 500
    .Width = 500
    Application.Wait (Now + TimeValue("00:00:01"))
    ' loop until the page finishes loading
    Do Until .ReadyState = 4: DoEvents: Loop
End With
....
If appIE.LocationURL <> sURL Then GoTo MakeIE
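
The ReadyState loop above never gives up if the page stalls. A possible variant with a timeout is sketched below; WaitForPage and its timeoutSeconds parameter are hypothetical names, not part of the original answer.

```vba
' Sketch: wait until IE reports READYSTATE_COMPLETE (4) or the timeout expires.
' Returns True if the page finished loading in time, False otherwise.
Function WaitForPage(ByVal appIE As Object, ByVal timeoutSeconds As Double) As Boolean
    Dim deadline As Date
    ' A Date is measured in days, so convert seconds to a fraction of a day.
    deadline = Now + timeoutSeconds / 86400#

    Do While appIE.Busy Or appIE.ReadyState <> 4
        DoEvents
        If Now > deadline Then Exit Function   ' timed out: result stays False
    Loop
    WaitForPage = True
End Function
```

For example, `If Not WaitForPage(appIE, 30) Then Exit Sub` stops the macro after roughly 30 seconds instead of hanging.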

Code to kill all IE windows (use with caution: it kills every IE process):

Option Explicit
Sub IE_Sledgehammer()
    Dim objWMI As Object, objProcess As Object, objProcesses As Object
    Set objWMI = GetObject("winmgmts://.")
    Set objProcesses = objWMI.ExecQuery( _
        "SELECT * FROM Win32_Process WHERE Name = 'iexplore.exe'")
    For Each objProcess In objProcesses
        On Error Resume Next
        Call objProcess.Terminate
        On Error GoTo 0
    Next
    Application.Wait (Now + TimeValue("0:00:03"))
    Set objProcesses = Nothing: Set objWMI = Nothing
    Application.Wait (Now + TimeValue("0:00:03"))
End Sub
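
A gentler alternative, offered only as a sketch, is to ask each open IE window to quit through the Shell.Application window collection instead of terminating processes. CloseAllIEWindows is a hypothetical name, and this assumes every IE window you want to close is visible to the shell.

```vba
' Sketch: ask each open Internet Explorer window to quit, leaving
' File Explorer windows and unrelated processes alone.
Sub CloseAllIEWindows()
    Dim shellWindows As Object, w As Object
    Dim i As Long

    Set shellWindows = CreateObject("Shell.Application").Windows
    ' Walk backwards because the collection shrinks as windows close.
    For i = shellWindows.Count - 1 To 0 Step -1
        Set w = shellWindows.Item(i)
        If Not w Is Nothing Then
            ' FullName is the path of the hosting executable.
            If LCase$(w.FullName) Like "*\iexplore.exe" Then w.Quit
        End If
    Next i
End Sub
```

Unlike terminating iexplore.exe, this lets each window shut down normally, but it may miss hidden or hung instances, in which case the process-based approach is still needed.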

【Comments】:
