【Question title】: Error duplicating data when scraping from Web to Excel
【Posted】: 2020-08-08 18:20:40
【Question】:

I am looking for help with my code for extracting data from the web into Excel.

Website to get the data from:

https://eport.saigonnewport.com.vn/Pages/Common/Containers_new

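  • Steps to get the data:

Put "Cát Lái" in the "Khu vực giao nhận container" field (selects the seaport)

Enter the container number in the "Container" field

Uncheck "Chỉ vòng luân chuyển cuối" to show all rows in the data table

Click Search to display the data table (container information search results)

Problem: the data scraped from the web into each Excel row (one row per container number found) appears to repeat the previous row's result when the information for the current container should be empty. For example: Event time 2 "10/4/2020 3:07:00 PM" is repeated for container "TEMU3311320", although this container has no Event time 2.

I hope you can give me some advice on solving this duplication problem. An Excel file is attached for your reference. Thank you.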

Sub PullDataFromWeb()
  Dim IE As Object, W As Excel.Worksheet
  Dim doc As HTMLDocument
  Dim lastRow As Integer, b As Boolean, tmp As String
  Dim lis, li
  Set W = ThisWorkbook.Sheets("Sheet1")
  Set IE = VBA.CreateObject("InternetExplorer.Application")
  IE.Visible = True   'show the IE window
  IE.navigate "https://eport.saigonnewport.com.vn/Pages/Common/Containers_new"
  Do While IE.Busy Or IE.readyState <> 4      'wait until IE has finished loading
    Application.Wait DateAdd("s", 1, Now)
  Loop
  Set doc = IE.document

  lastRow = W.Range("B" & W.UsedRange.Rows.Count + 2).End(xlUp).Row        'last used row in column B (container)
  If lastRow < 2 Then GoTo Ends
  On Error Resume Next
  For intRow = 2 To lastRow     'from the first data row to the last row
    b = False
    b = W.Range("I" & intRow).Value Like "[Yy]"
    If W.Range("B" & intRow).Value <> "" And Not b Then
      doc.getElementById("txtItemNo_I").Value = W.Range("B" & intRow).Value 'container number
      doc.getElementById("cbSite_VI").Value = W.Range("A" & intRow).Value
      doc.getElementById("chkInYard_I").Checked = False
      doc.getElementById("ContentPlaceHolder2_btnSearch").Click 'click Search
      '----------------------------------------------
      Do While IE.Busy Or IE.readyState <> 4
        Application.Wait DateAdd("s", 1, Now)
      Loop
      '----------------------------------------------
      strFindContainer = doc.getElementById("ContentPlaceHolder2_lblNotice").innerText
      W.Range("H" & intRow) = strFindContainer
      If strFindContainer Like "T*m th*y * container*" Then
        strEventtime1 = doc.getElementById("grdContainer_DXDataRow0").Cells(0).innerText
        strEventtype1 = doc.getElementById("grdContainer_DXDataRow0").Cells(1).innerText
        strLocation1 = doc.getElementById("grdContainer_DXDataRow0").Cells(2).innerText
        strEventtime2 = doc.getElementById("grdContainer_DXDataRow1").Cells(0).innerText
        strEventtype2 = doc.getElementById("grdContainer_DXDataRow1").Cells(1).innerText
        W.Range("C" & intRow) _
          .Resize(, 5).Value = Array(strEventtime1, strEventtype1, strLocation1, _
                         strEventtime2, strEventtype2)
      End If
    End If
  Next
Ends:
  IE.Quit
  Set IE = Nothing    'Cleaning up
  Set objElement = Nothing
  Set objCollection = Nothing
  Application.StatusBar = ""
  Application.DisplayAlerts = True
End Sub

【Comments】:

    Tags: excel vba web-scraping


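    【Solution 1】:

    Just before the final Next, make sure to assign vbNullString to all the relevant string variables, i.e. the ones used in Array(strEventtime1, strEventtype1, strLocation1, strEventtime2, strEventtype2). They are only assigned inside the If, so whenever an assignment does not happen, the value from a previous iteration is retained and carried into later loop iterations.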
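The carry-over can be reproduced without a browser. The following is a minimal sketch, not the asker's actual data: the Collection stands in for the web page, and the container name "CONT_A" is a hypothetical placeholder for an earlier container that does have a second event row. It relies on the fact that, under On Error Resume Next, an assignment whose right-hand side raises an error is skipped, so the variable keeps its old value unless it is reset first.

```vba
Option Explicit

Sub DemoResetPerIteration()
    Dim secondEvent As Collection
    Dim containers As Variant, c As Variant
    Dim strEventtime2 As String

    ' Stand-in for the web page: only "CONT_A" (hypothetical) has a second event row.
    Set secondEvent = New Collection
    secondEvent.Add "10/4/2020 3:07:00 PM", "CONT_A"

    containers = Array("CONT_A", "TEMU3311320")

    On Error Resume Next
    For Each c In containers
        ' Reset first, so a failed lookup cannot leave the previous value behind.
        strEventtime2 = vbNullString

        ' Raises an error for TEMU3311320 (no such key); the assignment is skipped.
        strEventtime2 = secondEvent.Item(c)

        Debug.Print c & " -> [" & strEventtime2 & "]"
    Next c
    On Error GoTo 0
End Sub
```

If the reset line is removed, the second iteration prints the time that belongs to the first container, which is the symptom described in the question. In the original loop the same reset would apply to all five variables; whether it goes at the top of the loop body or just before Next, the effect is the same.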

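A complementary sketch, swapping in a different technique from the one in the answer: instead of depending on On Error Resume Next, each lookup can check explicitly whether the grid row exists. CellText is a hypothetical helper name introduced here for illustration; the element IDs in the usage note are the ones from the question's code. It assumes that getElementById returns Nothing for a missing element, which is the usual behaviour of the MSHTML document object.

```vba
Option Explicit

' Hypothetical helper (not part of the original post): returns the text of one
' cell in a grid row, or vbNullString when the row or the cell does not exist.
Private Function CellText(ByVal doc As Object, ByVal rowId As String, _
                          ByVal cellIndex As Long) As String
    Dim rowEl As Object
    Dim cellEl As Object

    CellText = vbNullString

    Set rowEl = doc.getElementById(rowId)
    If rowEl Is Nothing Then Exit Function      ' the row is absent for this container

    Set cellEl = rowEl.Cells(cellIndex)
    If cellEl Is Nothing Then Exit Function     ' the row has fewer cells than expected

    CellText = cellEl.innerText
End Function
```

Inside the loop, a call such as strEventtime2 = CellText(doc, "grdContainer_DXDataRow1", 0) then yields an empty string for a container with only one event row, so nothing from an earlier container can be written to the sheet.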
    【Discussion】:
