【问题标题】:Convert HTML-table to Excel using VBA使用 VBA 将 HTML 表转换为 Excel
【发布时间】:2018-06-17 12:03:32
【问题描述】:

将 HTML 表格转换为 Excel

下面的代码获取位于https://rasmusrhl.github.io/stuff 的 HTML 表,并将其转换为 Excel 格式。

问题是:

  • 括号中的数字转换为负数
  • 数字被四舍五入或截断

解决方案

感谢大家的伟大贡献。各种各样的答案帮助我理解,就我的目的而言,解决方法是最好的 解决方案:因为我自己生成HTML表格,所以我可以控制每个单元格的CSS。存在 CSS 代码来指示 Excel 如何 解释单元格内容:http://cosicimiento.blogspot.dk/2008/11/styling-excel-cells-with-mso-number.html,在此也有解释 提问:Format HTML table cell so that Excel formats as text?

在我的例子中,CSS 应该是文本,即mso-number-format:\"\\@\"。它集成在下面的 R 代码中:

library(htmlTable)
library(nycflights13)
library(dplyr)

nycflights13::planes %>% 
    slice(1:10) %>% mutate( seats = seats*1.0001,
                            s1    = c("1-5", "5-10", "1/2", "1/10", "2-3", "1", "1.0", "01", "01.00", "asfdkjlæ" ),
                            s2    = c("(10)", "(12)", "(234)", "(00)", "(01)", "(098)", "(01)", "(01.)", "(001.0)", "()" )) -> df 


rle_man <- rle(df$manufacturer)

css_matrix <- matrix( data = "mso-number-format:\"\\@\"", nrow = nrow(df), ncol = ncol(df))
css_matrix[,1] <- "padding-left: 0.4cm;mso-number-format:\"\\@\""
css_matrix[,2:10] <- "padding-left: 1cm;mso-number-format:\"\\@\""
css_matrix[,5] <- "padding-left: 2cm;mso-number-format:\"\\@\""


htmlTable( x = df,  
           rgroup   = rle_man$values, n.rgroup = rle_man$lengths, 
           rnames   = FALSE, align = c("l", "r" ), 
           cgroup   =  rbind(  c("", "Some text goes here. It is long and does not break", "Other text goes here", NA),
                               c( "", "Machine type<br>(make)", "Specification of machine", "Other variables")),
           n.cgroup = rbind(   c(1,8,2, NA),
                               c(1, 3, 5, 2)), 
           css.cell = css_matrix )            -> html_out

temp_file <- tempfile( pattern = "table", fileext = ".html" )
readr::write_file( x = html_out, path = temp_file)
utils::browseURL( temp_file)

可以将该 HTML 文件拖放到 Excel 中,并将所有单元格解释为文本。请注意,只有将 html-file 拖放到 excel 中有效,在浏览器中打开表格并将其复制粘贴到 excel 中不起作用。

这种方法唯一缺少的是水平线,但我可以忍受。

下面是VBA,效果和拖拽一样:

Sub importhtml()
'
' importhtml Macro
'

'
With ActiveSheet.QueryTables.Add(Connection:= _
                                 "URL;file:///C:/Users/INSERTUSERNAME/Desktop/table18b85c0a20f3html.HTML", Destination:=Range("$a$1"))

.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = False
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingAll
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = True
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With

End Sub

【问题讨论】:

  • Excel 数字格式 - 是否设置为会计格式或类似的格式?如果是这样,那么括号/括号中的数字是负数...请参阅MSDNL: WebFormatting property - 你试过 .WebFormatting = xlWebFormattingNone
  • 在源 html 中(需要的地方)添加mso-number-format 样式不是更简单吗?
  • @CommonSense 也许是这样。我应该在 HTML 表格的每个单元格上附加什么内容,才能让 Excel 将其视为文本?
  • 您确定 () 中的数字不代表负值吗?
  • @Rasmus Larsen :我已经用服务器端 R 解决方案更新了我的答案。享受吧。

标签: html r vba excel


【解决方案1】:

使用 url https://rasmusrhl.github.io/stuff,幸运的是 Excel 可以直接打开它并保存为 .xlsx(在繁琐的过程之前为什么没有人尝试这个)。如果直接打开失败,这里的所有其他方法都是不错的选择!

Option Explicit

Sub OpenWebFile()
    Const URL As String = "https://rasmusrhl.github.io/stuff"
    Dim oWB As Workbook
    On Error Resume Next
    Set oWB = Workbooks.Open(Filename:=URL, ReadOnly:=True)
    If oWB Is Nothing Then
        MsgBox "Cannot open the url " & URL, vbExclamation + vbOKOnly, "ERR " & Err.Number & ":" & Err.Description
        Err.Clear
    Else
        ' Change to your desired path and filename
        oWB.SaveAs Filename:="C:\Test\stuff.xlsx", FileFormat:=xlOpenXMLWorkbook
        Set oWB = Nothing
    End If
End Sub

【讨论】:

  • 谢谢。该方案的问题在于:括号内的数字被转换为负数,数字被四舍五入或截断。
【解决方案2】:

处理 HTML,然后将其复制并粘贴到 Excel 中将

以下是我使用的步骤:

  • CreateObject("MSXML2.XMLHTTP"): 获取 URL 的 responseText
  • CreateObject("HTMLFile"): 从 responseText 创建一个 HTML 文档
  • 用黑色替换灰色以使边框变暗
  • 在列 s1 和 s2 前加上 @ 以保留格式
  • 将 HTML 复制到 Windows 剪贴板
    • 注意:HTML 需要包含在 HTML 和 Body 标签中才能正确粘贴
  • 设置目标工作表
  • 将 HTML 粘贴到工作表中
  • @ 符号替换为'
    • 注意:这会通过将数据存储为文本来保留格式
  • 完成工作表的格式化


Sub LoadTable()
    Const URL = "https://rasmusrhl.github.io/stuff/"
    Dim x As Long
    Dim doc As Object, tbl As Object, rw As Object
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .send
        If .readyState = 4 And .Status = 200 Then
            Set doc = CreateObject("HTMLFile")
            doc.body.innerHTML = .responseText
            doc.body.innerHTML = Replace(doc.body.innerHTML, "grey", "black")
            Set tbl = doc.getElementsByTagName("TABLE")(0)

            For x = 0 To tbl.Rows.Length - 1
                Set rw = tbl.Rows(x)

                If rw.Cells.Length = 14 Then
                    'If InStr(rw.Cells(12).innerText, "-") Or InStr(rw.Cells(12).innerText, "/") Then
                    rw.Cells(12).innerText = "@" & rw.Cells(12).innerText
                    rw.Cells(13).innerText = "@" & rw.Cells(13).innerText
                End If
            Next

            With CreateObject("new:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
                .SetText "<html><body>" & doc.body.innerHTML & "</body></html>"
                .PutInClipboard
            End With

            With Worksheets("Sheet1")
                .Cells.Clear
                .Range("A1").PasteSpecial
                .Cells.Interior.Color = vbWhite
                .Cells.WrapText = False
                .Columns.AutoFit
                .Columns("M:N").Replace What:="@", Replacement:="'"
            End With

        Else
            MsgBox "URL:  " & vbCrLf & "Ready state: " & .readyState & vbCrLf & "HTTP request status: " & .Status, vbInformation, "URL Not Responding"
        End If
    End With
End Sub

【讨论】:

  • 嗨,CreateObject("new:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}") 看起来和New MSForms.DataObject 一样,请问您为什么要这样做?是保存 Tools->Reference 吗?前几天我在一个网络浏览器上看到了它,并对它感到好奇。
  • 这是我个人宏工作簿中的一个代码 sn-p。每当我需要使用剪贴板时,我都会使用它而无需参考 MSForms 库。我使用早期绑定 sn-ps 编写了原始版本来处理 HTML,但将其转换为后期绑定,因此 OP 只需在运行代码之前修复 Worksheet 引用。
  • 好的,ta。我真的很喜欢通过GetObjectCreateObject 压缩的各种语法。谢谢。
【解决方案3】:
<style type=text/css>
    td {mso-number-format: '\@';}
</style>
<table ...

将上述单元格 (&lt;td&gt;s) 的全局样式定义放在您使用 R 生成的输出上在客户端重写文档,如下所示。

Sub importhtml()
    '*********** HTML document rewrite process ***************
    Const TableUrl = "https://rasmusrhl.github.io/stuff"

    Const adTypeBinary = 1, adSaveCreateOverWrite = 2, TemporaryFolder = 2
    Dim tempFilePath, binData() As Byte

    With CreateObject("Scripting.FileSystemObject")
        tempFilePath = .BuildPath(.GetSpecialFolder(TemporaryFolder), .GetTempName() & ".html")
    End With

    'download HTML document
    With CreateObject("MSXML2.ServerXMLHTTP")
        .Open "GET", TableUrl, False
        .Send
        If .Status <> 200 Then Err.Raise 3, "importhtml", "200 expected"
        binData = .ResponseBody
    End With

    With CreateObject("Adodb.Stream")
        .Charset = "x-ansi"
        .Open
        .WriteText "<style type=text/css>td {mso-number-format:'\@';}</style>"
        .Position = 0 'move to start
        .Type = adTypeBinary 'change stream type
        .Position = .Size 'move to end
        .Write binData 'append binary data end of stream
        .SaveToFile tempFilePath, adSaveCreateOverWrite 'save temporary file
        .Close
    End With
    '*********** HTML document rewrite process ***************

    With ActiveSheet.QueryTables.Add(Connection:= _
        "URL;" & tempFilePath, Destination:=Range("$A$1"))
        'load HTML document from rewritten local copy

        .Name = "stuff"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = False
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlEntirePage
        .WebFormatting = xlWebFormattingAll
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = True
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False

    End With

    Kill tempFilePath
End Sub

【讨论】:

    【解决方案4】:

    对于客户端解决方案

    所以在第一块代码之后运行这段代码,它会重写最后两列。

    Sub Test2()
        '* tools references ->
        '*   Microsoft HTML Object Library
    
    
        Dim oHtml4 As MSHTML.IHTMLDocument4
        Set oHtml4 = New MSHTML.HTMLDocument
    
        Dim oHtml As MSHTML.HTMLDocument
        Set oHtml = Nothing
    
        '* IHTMLDocument4.createDocumentFromUrl
        '* MSDN - IHTMLDocument4 createDocumentFromUrl method - https://msdn.microsoft.com/en-us/library/aa752523(v=vs.85).aspx
        Set oHtml = oHtml4.createDocumentFromUrl("https://rasmusrhl.github.io/stuff/", "")
        While oHtml.readyState <> "complete"
            DoEvents  '* do not comment this out it is required to break into the code if in infinite loop
        Wend
        Debug.Assert oHtml.readyState = "complete"
    
    
        Dim oTRs As MSHTML.IHTMLDOMChildrenCollection
        Set oTRs = oHtml.querySelectorAll("TR")
        Debug.Assert oTRs.Length = 17
    
        Dim lRowNum As Long
        For lRowNum = 3 To oTRs.Length - 1
    
            Dim oTRLoop As MSHTML.HTMLTableRow
            Set oTRLoop = oTRs.Item(lRowNum)
            If oTRLoop.ChildNodes.Length > 1 Then
    
                Debug.Assert oTRLoop.ChildNodes.Length = 14
    
                Dim oSecondToLastColumn As MSHTML.HTMLTableCell
                Set oSecondToLastColumn = oTRLoop.ChildNodes.Item(12)
    
                ActiveSheet.Cells(lRowNum + 2, 13).Value2 = "'" & oSecondToLastColumn.innerText
    
    
                Dim oLastColumn As MSHTML.HTMLTableCell
                Set oLastColumn = oTRLoop.ChildNodes.Item(13)
    
                ActiveSheet.Cells(lRowNum + 2, 14).Value2 = "'" & oLastColumn.innerText
    
            End If
            'Stop
    
        Next lRowNum
    
        ActiveSheet.Columns("M:M").EntireColumn.AutoFit
        ActiveSheet.Columns("N:N").EntireColumn.AutoFit
    
    
    End Sub
    

    对于服务器端解决方案

    现在我们知道您可以控制源脚本并且它在 R 中,那么您可以更改 R 脚本以使用 mso-number-format:'\@' 设置最后一列的样式。这是一个实现这一点的示例 R 脚本,它构建了一个与数据相同维度的 CSS 矩阵,并将 CSS 矩阵作为参数传递给htmlTable。我没有篡改您的 R 源代码,而是在这里给出一个简单的说明供您解释。

    A=matrix(c("(2)","(4)","(3)","(1)","(5)","(7)"),nrow=2,ncol=3,byrow=TRUE)
    css_matrix <- matrix(data="",nrow=2,ncol=3)
    css_matrix[,3] <- "mso-number-format:\"\\@\""
    htmlTable(x=A,css.cell=css_matrix)
    

    在 Excel 中打开我明白了

    Robin Mackenzie 补充

    您可能会在您的服务器端解决方案中提到 OP 只需要 将 css_matrix[,10:11]

    谢谢罗宾

    【讨论】:

    • @Rasmus:我真的认为不可能按照你的要求去做。如果您愿意,我很乐意调试上述解决方案。您必须转到 Tools->References 并检查库 Microsoft HTML Object Library,如第 2 行和第 3 行的评论中所述。
    • @Rasmus:无论如何,500 的大赏金应该引起注意。祝你好运。
    • 如果您控制源文档当然可以使用mso-number-format:'\@' 强制在样式标签中输入文本。在这种情况下,这显然是最简单、最简单的解决方案。 cosicimiento.blogspot.co.uk/2008/11/…
    • @SMeaden - 您可能会在您的服务器端解决方案中提到,OP 只需将 css_matrix[,10:11] &lt;- "mso-number-format:\"\\@\"" 添加到他们现有的 R 代码中(在最后一行 css_matrix... 之后),它将实现您的解决方案他们的具体问题(同意您的通用解决方案有效)。这值得赏金。 OP 问题列是 11 btw。
    • @RobinMackenzie :干杯,添加并归功于您;)我不敢介入他的 R 脚本。我昨天才下载的。 exceldevelopmentplatform.blogspot.co.uk/2018/01/…
    【解决方案5】:

    要从该页面获取表格数据(保持格式不变),您可以尝试如下:

     Sub Fetch_Data()
        Dim http As New XMLHTTP60, html As New HTMLDocument
        Dim posts As Object, post As Object, elem As Object
        Dim row As Long, col As Long
    
        With http
            .Open "GET", "https://rasmusrhl.github.io/stuff/", False
            .send
            html.body.innerHTML = .responseText
        End With
    
        Set posts = html.getElementsByClassName("gmisc_table")(0)
    
        For Each post In posts.Rows
            For Each elem In post.Cells
                col = col + 1: Cells(row + 1, col).NumberFormat = "@": Cells(row + 1, col) = elem.innerText
            Next elem
            col = 0
            row = row + 1
        Next post
    End Sub
    

    添加到库的参考:

    1. Microsoft HTML Object Library
    2. Microsoft XML, v6.0  'or whatever version you have
    

    这就是该部分在被解析时的样子。

    【讨论】:

    • 谢谢。我正在寻找一个看起来尽可能多的 HTML 的解决方案。包括嵌套标题、斜体和缩进等。
    【解决方案6】:

    您可以尝试一下,看看您是否获得了所需的输出...

    Sub GetWebData()
    Dim IE As Object
    Dim doc As Object
    Dim TRs As Object
    Dim TR As Object
    Dim Cell As Object
    Dim r As Long, c As Long
    
    Application.ScreenUpdating = False
    
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = False
    IE.navigate "https://rasmusrhl.github.io/stuff/"
    Do While IE.Busy Or IE.readyState <> 4
        DoEvents
    Loop
    Set doc = IE.document
    
    Set TRs = doc.getElementsByTagName("tr")
    Cells.Clear
    
    For Each TR In TRs
        r = r + 1
        For Each Cell In TR.Children
            c = c + 1
            Cells(r, c).NumberFormat = "@"
            Cells(r, c) = Cell.innerText
        Next Cell
        c = 0
    Next TR
    IE.Quit
    Columns.AutoFit
    Application.ScreenUpdating = True
    End Sub
    

    解决方案 2:

    要使其工作,您需要通过转到工具(在 VBA 编辑器上)--> 参考来添加以下两个引用,然后找到下面提到的两个引用并选中它们的复选框并单击确定。

    1) Microsoft XML,v6.0(查找可用的最高版本)

    2) Microsoft HTML 对象库

    Sub GetWebData2()
    Dim XMLpage As New MSXML2.XMLHTTP60
    Dim doc As New MSHTML.HTMLDocument
    Dim TRs As IHTMLElementCollection
    Dim TR As IHTMLElement
    Dim Cell As IHTMLElement
    Dim r As Long, c As Long
    
    Application.ScreenUpdating = False
    
    Set XMLpage = CreateObject("MSXML2.XMLHTTP")
    
    XMLpage.Open "GET", "https://rasmusrhl.github.io/stuff/", False
    XMLpage.send
    doc.body.innerhtml = XMLpage.responsetext
    Set TRs = doc.getElementsByTagName("tr")
    Set TRs = doc.getElementsByTagName("tr")
    Cells.Clear
    
    For Each TR In TRs
        r = r + 1
        For Each Cell In TR.Children
            c = c + 1
            Cells(r, c).NumberFormat = "@"
            Cells(r, c) = Cell.innerText
        Next Cell
        c = 0
    Next TR
    Columns.AutoFit
    Application.ScreenUpdating = True
    End Sub
    

    【讨论】:

    • 当我运行这个我得到method busy of object iwebbrowser2 failed
    • 这对我有用,没有问题。我添加了另一个解决方案,看看是否适合您。
    【解决方案7】:

    试试这个,以表格形式导入数据:

    Sub ImportDataAsTable()
        ActiveWorkbook.Queries.Add Name:="Table 0", Formula:= _
            "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""https://rasmusrhl.github.io/stuff/""))," & Chr(13) & "" & Chr(10) & "    Data0 = Source{0}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data0,{{""tailnum"", type text}, {"""", type text}, {""Some text goes here. It is long and does not break Machine type (make) year"", type text}, {""Some text goes here. It is long and does not break Mach" & _
            "ine type (make) type"", type text}, {""Some text goes here. It is long and does not break Machine type (make) manufacturer"", type text}, {""Some text goes here. It is long and does not break"", type text}, {""Some text goes here. It is long and does not break Specification of machine model"", type text}, {""Some text goes here. It is long and does not break Specifi" & _
            "cation of machine engines"", type text}, {""Some text goes here. It is long and does not break Specification of machine seats"", type text}, {""Some text goes here. It is long and does not break Specification of machine speed"", type text}, {""Some text goes here. It is long and does not break Specification of machine engine"", type text}, {""2"", type text}, {""Oth" & _
            "er text goes here Other variables s1"", type text}, {""Other text goes here Other variables s2"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
        ActiveWorkbook.Worksheets.Add
        With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
            "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 0"";Extended Properties=""""" _
            , Destination:=Range("$A$1")).QueryTable
            .CommandType = xlCmdSql
            .CommandText = Array("SELECT * FROM [Table 0]")
            .RowNumbers = False
            .FillAdjacentFormulas = False
            .PreserveFormatting = True
            .RefreshOnFileOpen = False
            .BackgroundQuery = True
            .RefreshStyle = xlInsertDeleteCells
            .SavePassword = False
            .SaveData = True
            .AdjustColumnWidth = True
            .RefreshPeriod = 0
            .PreserveColumnInfo = True
            .ListObject.DisplayName = "Table_0"
            .Refresh BackgroundQuery:=False
        End With
    End Sub
    

    【讨论】:

    • 哇,Microsoft.Mashup.OleDb.1,我以前从未见过。一些链接或解释会很好。 mrexcel.com/forum/excel-questions/… 很高兴看到大量赏金涌现出伟大的新想法。
    • 谢谢。运行此程序可以正确获取各个单元格的格式。但是缺少标题、子标题、行组和缩进:(。我的目标是使表格看起来尽可能像 HTML 表格。我的原始代码接近于此,但某些单元格内容已格式化错了:(
    • @ashleedawg:这就是 Power Query 对吧? support.office.com/en-us/article/…
    • 我认为这需要Office 2016来编译。 stackoverflow.com/questions/48224836/…
    【解决方案8】:

    这适用于临时文件。

    它的作用: 本地下载数据。然后,用“\”替换“(”。然后,导入数据。将数据格式化为文本(以确保我们可以将其更改回没有错误)。然后,更改文本。这不能用 Range.Replace 完成因为这将重新格式化单元格内容。

    ' Local Variables
    Public FileName As String ' Temp File Path
    Public FileUrl As String ' Url Formatted Temp File Path
    Public DownloadUrl As String ' Where We're Going to Download From
    
    ' Declares Have to Be At Top
    Private Declare Function GetTempPath Lib "kernel32" _
      Alias "GetTempPathA" _
      (ByVal nBufferLength As Long, _
      ByVal lpBuffer As String) As Long
    Private Declare Function GetTempFileName Lib "kernel32" _
      Alias "GetTempFileNameA" _
      (ByVal lpszPath As String, _
      ByVal lpPrefixString As String, _
      ByVal wUnique As Long, _
      ByVal lpTempFileName As String) As Long
    
    ' Loads the HTML Content Without Bug
    Sub ImportHtml()
    
        ' Set Our Download URL
        DownloadUrl = "https://rasmusrhl.github.io/stuff"
    
        ' Sets the Temporary File Path
        SetFilePath
    
        ' Downloads the File
        DownloadFile
    
        ' Replaces the "(" in the File With "\(", We Will Later Put it Back
        ' This Ensures Formatting of Content Isn't Modified!!!
        ReplaceStringInFile
    
    
        ' Our Query Table is Now Coming From the Local File, Instead
        Dim s As QueryTable
        Set s = ActiveSheet.QueryTables.Add(Connection:=("FINDER;file://" + FileUrl), Destination:=Range("$A$1"))
    
        With s
    
            .Name = "stuff"
            .FieldNames = True
            .RowNumbers = False
            .FillAdjacentFormulas = False
            .PreserveFormatting = False
            .RefreshOnFileOpen = False
            .BackgroundQuery = True
            .RefreshStyle = xlInsertDeleteCells
            .SavePassword = False
            .SaveData = True
            .AdjustColumnWidth = True
            .RefreshPeriod = 0
            .WebSelectionType = xlEntirePage
            .WebFormatting = xlWebFormattingAll
            .WebPreFormattedTextToColumns = True
            .WebConsecutiveDelimitersAsOne = True
            .WebSingleBlockTextImport = False
            .WebDisableDateRecognition = True
            .WebDisableRedirections = False
            .Refresh BackgroundQuery:=False
    
            ' Sets Formatting So When We Change Text the Data Doesn't Change
            .ResultRange.NumberFormat = "@"
    
            ' Loop Through Cells in Range
            ' If You Do Excel Replace, Instead It Will Change Cell Format
            Const myStr As String = "\(", myReplace As String = "("
            For Each c In .ResultRange.Cells
                Do While c.Value Like "*" & myStr & "*"
                    c.Characters(InStr(1, c.Value, myStr), Len(myStr)).Text = myReplace
                Loop
            Next
    
    
        End With
    End Sub
    
    ' This function replaces the "(" in the file with "\("
    Sub ReplaceStringInFile()
    
        Dim sBuf As String
        Dim sTemp As String
        Dim iFileNum As Integer
        Dim sFileName As String
    
        ' Edit as needed
        sFileName = FileName
    
        iFileNum = FreeFile
        Open sFileName For Input As iFileNum
    
        Do Until EOF(iFileNum)
            Line Input #iFileNum, sBuf
            sTemp = sTemp & sBuf & vbCrLf
        Loop
        Close iFileNum
    
        sTemp = Replace(sTemp, "(", "\(")
    
        iFileNum = FreeFile
        Open sFileName For Output As iFileNum
        Print #iFileNum, sTemp
        Close iFileNum
    
    End Sub
    
    ' This function sets file paths because we need a temp file
    Function SetFilePath()
    
        If FileName = "" Then
            FileName = GetTempHtmlName
            FileUrl = Replace(FileName, "\", "/")
        End If
    
    End Function
    
    ' This subroutine downloads the file from the specified URL
    ' The download is necessary because we will be editing the file
    Sub DownloadFile()
    
        Dim myURL As String
        myURL = "https://rasmusrhl.github.io/stuff"
    
        Dim WinHttpReq As Object
        Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
        WinHttpReq.Open "GET", DownloadUrl, False, "username", "password"
        WinHttpReq.send
    
        myURL = WinHttpReq.responseBody
        If WinHttpReq.Status = 200 Then
            Set oStream = CreateObject("ADODB.Stream")
            oStream.Open
            oStream.Type = 1
            oStream.Write WinHttpReq.responseBody
            oStream.SaveToFile FileName, 2 ' 1 = no overwrite, 2 = overwrite
            oStream.Close
        End If
    
    End Sub
    
    '''''''''''''''''''''''''''''
    ' THIS BLOCK OF CODE GETS A TEMPORARY FILE PATH USING THE GetTempHtmlName Function
    '''''''''''''''''''''''''''''
    
    
    Public Function GetTempHtmlName( _
      Optional sPrefix As String = "VBA", _
      Optional sExtensao As String = "") As String
      Dim sTmpPath As String * 512
      Dim sTmpName As String * 576
      Dim nRet As Long
      Dim F As String
      nRet = GetTempPath(512, sTmpPath)
      If (nRet > 0 And nRet < 512) Then
        nRet = GetTempFileName(sTmpPath, sPrefix, 0, sTmpName)
        If nRet <> 0 Then F = Left$(sTmpName, InStr(sTmpName, vbNullChar) - 1)
        If sExtensao > "" Then
          Kill F
          If Right(F, 4) = ".tmp" Then F = Left(F, Len(F) - 4)
          F = F & sExtensao
        End If
        F = Replace(F, ".tmp", ".html")
        GetTempHtmlName = F
      End If
    End Function
    
    '''''''''''''''''''''''''''''
    ' End - GetTempHtmlName
    '''''''''''''''''''''''''''''
    

    【讨论】:

    • 我只需要更改订单。我已经重新格式化,但 VBA 需要变量,然后 pinvokes 然后函数/subs,所以它改变了我在 Excel 中的方式。
    • 谢谢,哇,我没想到会这么复杂 :) 无论如何,我得到Invalid procedure call or argument,当我点击debug 时,它指向行:WinHttpReq.Open "GET", DownloadUrl, False, "username", "password"。我对VBA一无所知,我做错了什么?
    • 它有效....您只需要确保启动 importhtml 宏...它将调用其余的...如果您独立执行其他宏之一...您将得到那个错误
    • 另外...您不能进入 VBA 并直接点击播放,因为有多个子例程...您必须显式调用 importhtml 一个
    • @rasmuslarsen 代码有效...如果我的 cmets 之外还有其他问题,请告诉我
    【解决方案9】:

    根据 Microsoft MSDN Library: WebFormatting Property 的文档,您可以尝试对代码进行以下更改:

     .WebFormatting = xlWebFormattingNone
    

    这可能允许在没有任何数字格式的情况下复制数据 - 然后您可以为这些单元格设置自己的数字格式(使用 MSDN: Excel VBA NumberFormat property

    类似的解决方案应该可以解决数字被截断或舍入的问题 - 为目标范围内受影响的单元格设置小数点...

    【讨论】:

    • 谢谢!不幸的是,它不起作用。设置 .WebFormatting = xlWebFormattingNone 仍然会将括号中的数字更改为负数。
    • @RasmusLarsen:检查显示为负数的单元格的数字格式... tr 更改为文本
    • 谢谢,但似乎不起作用:( 运行宏后,所有单元格的数字格式都是“常规”。如果我更改为“文本”,显示值和实际值仍然例如:“-98”。
    • @RasmusLarsen:您可以在get data from websites using Excel找到更多帮助
    猜你喜欢
    • 2012-07-03
    • 1970-01-01
    • 2015-08-27
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2018-12-31
    • 1970-01-01
    • 2011-08-30
    相关资源
    最近更新 更多