【问题标题】:Getting the headings from a Word document从 Word 文档中获取标题
【发布时间】:2010-09-21 10:45:59
【问题描述】:

如何使用 VBA 获取 word 文档中所有标题的列表?

【问题讨论】:

    标签: vba ms-word


    【解决方案1】:

    您的意思是像这样的createOutline 函数(实际上将所有标题从源 Word 文档复制到新的 Word 文档中):

    (我相信 asrHeadings = _docSource.<strong>@987654322@</strong>(wdRefTypeHeading) 函数是这个程序的关键,应该可以让你检索到你想要的东西)

    Public Sub CreateOutline()
        Dim docOutline As Word.Document
        Dim docSource As Word.Document
        Dim rng As Word.Range
    
        Dim astrHeadings As Variant
        Dim strText As String
        Dim intLevel As Integer
        Dim intItem As Integer
    
        Set docSource = ActiveDocument
        Set docOutline = Documents.Add
    
        ' Content returns only the main body of the document, not the headers/footer.        
        Set rng = docOutline.Content
        ' GetCrossReferenceItems(wdRefTypeHeading) returns an array with references to all headings in the document
        astrHeadings = docSource.GetCrossReferenceItems(wdRefTypeHeading)
    
        For intItem = LBound(astrHeadings) To UBound(astrHeadings)
            ' Get the text and the level.
            strText = Trim$(astrHeadings(intItem))
            intLevel = GetLevel(CStr(astrHeadings(intItem)))
    
            ' Add the text to the document.
            rng.InsertAfter strText & vbNewLine
    
            ' Set the style of the selected range and
            ' then collapse the range for the next entry.
            rng.Style = "Heading " & intLevel
            rng.Collapse wdCollapseEnd
        Next intItem
    End Sub
    
    Private Function GetLevel(strItem As String) As Integer
        ' Return the heading level of a header from the
        ' array returned by Word.
    
        ' The number of leading spaces indicates the
        ' outline level (2 spaces per level: H1 has
        ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces.
    
        Dim strTemp As String
        Dim strOriginal As String
        Dim intDiff As Integer
    
        ' Get rid of all trailing spaces.
        strOriginal = RTrim$(strItem)
    
        ' Trim leading spaces, and then compare with
        ' the original.
        strTemp = LTrim$(strOriginal)
    
        ' Subtract to find the number of
        ' leading spaces in the original string.
        intDiff = Len(strOriginal) - Len(strTemp)
        GetLevel = (intDiff / 2) + 1
    End Function
    

    @kol 于 2018 年 3 月 6 日更新

    虽然astrHeadings 是一个数组(IsArray 返回 TrueTypeName 返回 String())当我尝试在 VBScript 中访问其元素时出现 type mismatch 错误(Windows 上的 v5.8.16384 10 临 1709 16299.248)。这一定是一个特定于 VBScript 的问题,因为如果我在 Word 的 VBA 编辑器中运行相同的代码,我就可以访问这些元素。我最终迭代了 TOC 的行,因为它甚至可以在 VBScript 中工作:

    For Each Paragraph In Doc.TablesOfContents(1).Range.Paragraphs
      WScript.Echo Paragraph.Range.Text
    Next
    

    【讨论】:

    • 除了将int 更改为long 以提高宏速度。
    • 按照@Wikis 的建议,我将函数的所有int 替换为long,但它给了我一个错误9“下标超出范围”。一些 int 可以被替换,但不是全部。 cf我发布的答案知道是哪一个。 (在 Word Pro 2013 中)
    • 使用这种方法注意截断的标题 (GetCrossReferenceItems)。 windowssecrets.com/forums/showthread.php/…
    • 虽然astrHeadings 是一个数组(IsArray 返回TrueTypeName 返回String())当我尝试获取它的元素时我得到一个type mismatch 错误(VBScript 5.8. 16384 在 Windows 10 专业版 1709 16299.248)。
    • @kol 9 年后,这是可能的。我当时没有在 Windows 10 中测试它;)
    【解决方案2】:

    获取标题列表最简单的方法是遍历文档中的段落,例如:

     Sub ReadPara()
    
        Dim DocPara As Paragraph
    
        For Each DocPara In ActiveDocument.Paragraphs
    
         If Left(DocPara.Range.Style, Len("Heading")) = "Heading" Then
    
           Debug.Print DocPara.Range.Text
    
         End If
    
        Next
    
    
    End Sub
    

    顺便说一句,我发现删除段落范围的最后一个字符是个好主意。否则,如果将字符串发送到消息框或文档,Word 会显示一个额外的控制字符。例如:

    Left(DocPara.Range.Text, len(DocPara.Range.Text)-1)
    

    【讨论】:

    • 比选定的答案更喜欢这个 - 它给了我更好的结果和更大的灵活性。
    • 我试过了,但是速度太慢了...花了大约 15 分钟的处理时间来循环我的文档(有很多表格,所以有超过 45000 段)
    【解决方案3】:

    这个宏非常适合我(Word 2010)。我稍微扩展了功能:现在它提示用户输入最低级别,并在该级别以下隐藏子标题。

    Public Sub CreateOutline()
    ' from http://stackoverflow.com/questions/274814/getting-the-headings-from-a-word-document
        Dim docOutline As Word.Document
        Dim docSource As Word.Document
        Dim rng As Word.Range
    
        Dim astrHeadings As Variant
        Dim strText As String
        Dim intLevel As Integer
        Dim intItem As Integer
        Dim minLevel As Integer
    
        Set docSource = ActiveDocument
        Set docOutline = Documents.Add
    
        minLevel = 1  'levels above this value won't be copied.
        minLevel = CInt(InputBox("This macro will generate a new document that contains only the headers from the existing document. What is the lowest level heading you want?", "2"))
    
        ' Content returns only the
        ' main body of the document, not
        ' the headers and footer.
        Set rng = docOutline.Content
        astrHeadings = _
         docSource.GetCrossReferenceItems(wdRefTypeHeading)
    
        For intItem = LBound(astrHeadings) To UBound(astrHeadings)
            ' Get the text and the level.
            strText = Trim$(astrHeadings(intItem))
            intLevel = GetLevel(CStr(astrHeadings(intItem)))
    
            If intLevel <= minLevel Then
    
                ' Add the text to the document.
                rng.InsertAfter strText & vbNewLine
    
                ' Set the style of the selected range and
                ' then collapse the range for the next entry.
                rng.Style = "Heading " & intLevel
                rng.Collapse wdCollapseEnd
            End If
        Next intItem
    End Sub
    
    Private Function GetLevel(strItem As String) As Integer
        ' from http://stackoverflow.com/questions/274814/getting-the-headings-from-a-word-document
        ' Return the heading level of a header from the
        ' array returned by Word.
    
        ' The number of leading spaces indicates the
        ' outline level (2 spaces per level: H1 has
        ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces.
    
        Dim strTemp As String
        Dim strOriginal As String
        Dim intDiff As Integer
    
        ' Get rid of all trailing spaces.
        strOriginal = RTrim$(strItem)
    
        ' Trim leading spaces, and then compare with
        ' the original.
        strTemp = LTrim$(strOriginal)
    
        ' Subtract to find the number of
        ' leading spaces in the original string.
        intDiff = Len(strOriginal) - Len(strTemp)
        GetLevel = (intDiff / 2) + 1
    End Function
    

    【讨论】:

      【解决方案4】:

      提取所有标题的最快方法(到 LEVEL5)。

      Sub EXTRACT_HDNGS()
      Dim WDApp As Word.Application    'WORD APP
      Dim WDDoc As Word.Document       'WORD DOC
      
      Set WDApp = Word.Application
      Set WDDoc = WDApp.ActiveDocument
      
      For Head_n = 1 To 5
      Head = ("Heading " & Head_n)
      WDApp.Selection.HomeKey wdStory, wdMove
      
          Do
             With WDApp.selection
            .MoveStart Unit:=wdLine, Count:=1    
            .Collapse Direction:=wdCollapseEnd
             End with
              With WDApp.Selection.Find
                .ClearFormatting:          .text = "":     
                .MatchWildcards = False:   .Forward = True
                .Style = WDDoc.Styles(Head)
               If .Execute = False Then GoTo Level_exit
                  .ClearFormatting
              End With
      
             Heading_txt = RemoveSpecialChar(WDApp.Selection.Range.text, 1):              Debug.Print Heading_txt
             Heading_lvl = WDApp.Selection.Range.ListFormat.ListLevelNumber:              Debug.Print Heading_lvl
             Heading_lne = WDDoc.Range(0, WDApp.Selection.Range.End).Paragraphs.Count:    Debug.Print Heading_lne
             Heading_pge = WDApp.Selection.Information(wdActiveEndPageNumber):            Debug.Print Heading_pge
      
             If Wdapp.Selection.Style = "Heading 1" Then GoTo Level_exit
             Wdapp.Selection.Collapse Direction:=wdCollapseStart
         Loop
      Level_exit:
      Next Head_n
      
      End Sub
      

      【讨论】:

        【解决方案5】:

        根据 Wikis 对 VonC 答案的评论,这里是对我有用的代码。它使函数更快。

        Public Sub CopyHeadingsInNewDoc()
            Dim docOutline As Word.Document
            Dim docSource As Word.Document
            Dim rng As Word.Range
        
            Dim astrHeadings As Variant
            Dim strText As String
            Dim longLevel As Integer
            Dim longItem As Integer
        
            Set docSource = ActiveDocument
            Set docOutline = Documents.Add
        
            ' Content returns only the
            ' main body of the document, not
            ' the headers and footer.
            Set rng = docOutline.Content
            astrHeadings = _
             docSource.GetCrossReferenceItems(wdRefTypeHeading)
        
            For intItem = LBound(astrHeadings) To UBound(astrHeadings)
                ' Get the text and the level.
                strText = Trim$(astrHeadings(intItem))
                intLevel = GetLevel(CStr(astrHeadings(intItem)))
        
                ' Add the text to the document.
                rng.InsertAfter strText & vbNewLine
        
                ' Set the style of the selected range and
                ' then collapse the range for the next entry.
                rng.Style = "Heading " & intLevel
                rng.Collapse wdCollapseEnd
            Next intItem
        End Sub
        
        Private Function GetLevel(strItem As String) As Integer
            ' Return the heading level of a header from the
            ' array returned by Word.
        
            ' The number of leading spaces indicates the
            ' outline level (2 spaces per level: H1 has
            ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces.
        
            Dim strTemp As String
            Dim strOriginal As String
            Dim longDiff As Integer
        
            ' Get rid of all trailing spaces.
            strOriginal = RTrim$(strItem)
        
            ' Trim leading spaces, and then compare with
            ' the original.
            strTemp = LTrim$(strOriginal)
        
            ' Subtract to find the number of
            ' leading spaces in the original string.
            longDiff = Len(strOriginal) - Len(strTemp)
            GetLevel = (longDiff / 2) + 1
        End Function
        

        【讨论】:

        • 我 6 岁以上的答案很有趣。 +1
        • 我本可以编辑您的答案,但由于您没有根据 Wiki 评论进行编辑,我不确定这是否是个好主意! (我还是 VBA 的新手)
        • @VonC 顺便说一句,有没有办法使用此功能仅选择标题 1 和 2(如果需要,您可以编辑我的答案以反映更改;-)!)
        【解决方案6】:

        为什么要重复这么多次?!?

        “所有标题的列表”只是文档的标准 Word 索引!

        这是我在向文档添加索引时录制宏得到的:

        Sub Macro1()
            ActiveDocument.TablesOfContents.Add Range:=Selection.Range, _
                RightAlignPageNumbers:=True, _
                UseHeadingStyles:=True, _
                UpperHeadingLevel:=1, _
                LowerHeadingLevel:=5, _
                IncludePageNumbers:=True, _
                AddedStyles:="", _
                UseHyperlinks:=True, _
                HidePageNumbersInWeb:=True, _
                UseOutlineLevels:=True
        End Sub
        

        【讨论】:

          【解决方案7】:

          您还可以在文档中创建目录并复制它。这将 para ref 从标题中分离出来,如果您需要在另一个上下文中呈现它,这很方便。 如果您不想在文档中使用 ToC,只需在 Copy n Paste 之后将其删除。 JK。

          【讨论】:

            猜你喜欢
            • 1970-01-01
            • 1970-01-01
            • 2014-02-23
            • 2013-07-06
            • 1970-01-01
            • 2012-01-12
            • 2011-06-17
            • 1970-01-01
            相关资源
            最近更新 更多