Using VBA to extract the headings and page numbers of a Word document's table of contents

Date: 2010-09-02 14:42:36

Tags: vba ms-word word-vba

This is basically what we have so far:

Getting the headings from a Word document

Public Sub CreateOutline()
    Dim docOutline As Word.Document
    Dim docSource As Word.Document
    Dim rng As Word.Range

    Dim astrHeadings As Variant
    Dim strText As String
    Dim intLevel As Integer
    Dim intItem As Integer

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

    ' Content returns only the
    ' main body of the document, not
    ' the headers and footer.
    Set rng = docOutline.Content
    astrHeadings = _
     docSource.GetCrossReferenceItems(wdRefTypeHeading)

    For intItem = LBound(astrHeadings) To UBound(astrHeadings)
        ' Get the text and the level.
        strText = Trim$(astrHeadings(intItem))
        intLevel = GetLevel(CStr(astrHeadings(intItem)))

        ' Add the text to the document.
        rng.InsertAfter strText & vbNewLine

        ' Set the style of the selected range and
        ' then collapse the range for the next entry.
        rng.Style = "Heading " & intLevel
        rng.Collapse wdCollapseEnd
    Next intItem
End Sub

Private Function GetLevel(strItem As String) As Integer
    ' Return the heading level of a header from the
    ' array returned by Word.

    ' The number of leading spaces indicates the
    ' outline level (2 spaces per level: H1 has
    ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces).

    Dim strTemp As String
    Dim strOriginal As String
    Dim intDiff As Integer

    ' Get rid of all trailing spaces.
    strOriginal = RTrim$(strItem)

    ' Trim leading spaces, and then compare with
    ' the original.
    strTemp = LTrim$(strOriginal)

    ' Subtract to find the number of
    ' leading spaces in the original string.
    intDiff = Len(strOriginal) - Len(strTemp)
    GetLevel = (intDiff / 2) + 1
End Function

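But I also need the page number of each heading.

I tried searching for each heading, selecting the search result, and retrieving wdActiveEndPageNumber.

That does not work, it is slow, and it is a really ugly approach.

I would like to paste what I find into another Word document, for example: rng.InsertAfter "Page: " & pageNum & " Heading: " & strText & vbNewLine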

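3 answers:

Answer 0: (score: 6)

I may not be understanding the question, but this code walks through the document, looks for paragraphs that are formatted as headings, and prints the page each one is on.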

Public Sub SeeHeadingPageNumber()
    On Error GoTo MyErrorHandler

    Dim sourceDocument As Document
    Set sourceDocument = ActiveDocument

    Dim myPara As Paragraph
    For Each myPara In sourceDocument.Paragraphs
        myPara.Range.Select 'For debug only
        If InStr(LCase$(myPara.Range.Style.NameLocal), LCase$("heading")) > 0 Then
            Debug.Print myPara.Range.Information(wdActiveEndAdjustedPageNumber)
        End If

        DoEvents
    Next

    Exit Sub

MyErrorHandler:
    MsgBox "SeeHeadingPageNumber" & vbCrLf & vbCrLf & "Err = " & Err.Number & vbCrLf & "Description: " & Err.Description
End Sub
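
Building on the loop above, here is a minimal sketch of writing the "Page: ... Heading: ..." lines the question asks for into a new document. The name ListHeadingPages is hypothetical. The sketch assumes headings can be recognised by their paragraph outline level (anything other than body text), which avoids matching on a localised style name but also picks up custom styles that carry an outline level.

```vba
' Sketch only: ListHeadingPages is a hypothetical name. It assumes that
' every paragraph whose outline level is not "body text" is a heading.
Public Sub ListHeadingPages()
    Dim docSource As Document
    Dim docOutline As Document
    Dim para As Paragraph
    Dim rng As Range
    Dim strText As String
    Dim lngPage As Long

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

    For Each para In docSource.Paragraphs
        If para.OutlineLevel <> wdOutlineLevelBodyText Then
            Set rng = para.Range
            ' Drop the trailing paragraph mark before reading the text.
            rng.MoveEnd Unit:=wdCharacter, Count:=-1
            strText = Trim$(rng.Text)

            ' Skip empty heading paragraphs.
            If Len(strText) > 0 Then
                lngPage = rng.Information(wdActiveEndAdjustedPageNumber)
                docOutline.Content.InsertAfter _
                    "Page: " & lngPage & " Heading: " & strText & vbNewLine
            End If
        End If
    Next para
End Sub
```

Because the page number is read from the paragraph's own range, nothing has to be searched for or selected, which should make this noticeably faster than the find-and-select approach described in the question.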

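Answer 1: (score: 0)

Try using the Table of Contents field. The following code dissects the TOC and gives you each item, its page number, and its style. You may have to parse each string to get the exact information or format you need.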

Public Sub SeeTOCInfo()
    On Error GoTo MyErrorHandler

    Dim sourceDocument As Document
    Set sourceDocument = ActiveDocument

    Dim myField As Field
    For Each myField In sourceDocument.TablesOfContents(1).Range.Fields
        Debug.Print Replace(myField.Result.Text, Chr(13), "-") & " " & " Type: " & myField.Type
        If Not myField.Result.Style Is Nothing Then
            Debug.Print myField.Result.Style
        End If
        DoEvents
    Next

    Exit Sub

MyErrorHandler:
    MsgBox "SeeTOCInfo" & vbCrLf & vbCrLf & "Err = " & Err.Number & vbCrLf & "Description: " & Err.Description
End Sub
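
As a rough illustration of the parsing step mentioned above, the sketch below reads the TOC one paragraph at a time and splits each entry at its last tab. SplitTOCEntry and ListTOCEntries are hypothetical names, and the sketch assumes the default TOC layout (heading text, a tab, then the page number) and a TOC that has been updated recently.

```vba
' Hypothetical helper: splits "Heading<Tab>Page" at the last tab.
' Returns False when the line does not look like a TOC entry.
Private Function SplitTOCEntry(ByVal strLine As String, _
        ByRef strHeading As String, ByRef strPage As String) As Boolean
    Dim lngTab As Long

    ' Remove the trailing paragraph mark, if any.
    If Right$(strLine, 1) = vbCr Then
        strLine = Left$(strLine, Len(strLine) - 1)
    End If

    lngTab = InStrRev(strLine, vbTab)
    If lngTab > 0 Then
        ' Numbered headings may contain their own tab; show it as a space.
        strHeading = Trim$(Replace(Left$(strLine, lngTab - 1), vbTab, " "))
        strPage = Trim$(Mid$(strLine, lngTab + 1))
        SplitTOCEntry = (Len(strHeading) > 0 And Len(strPage) > 0)
    End If
End Function

' Sketch only: prints one "Page: n Heading: text" line per TOC entry.
Public Sub ListTOCEntries()
    Dim para As Paragraph
    Dim rng As Range
    Dim strHeading As String
    Dim strPage As String

    If ActiveDocument.TablesOfContents.Count = 0 Then
        MsgBox "This document has no table of contents."
        Exit Sub
    End If

    For Each para In ActiveDocument.TablesOfContents(1).Range.Paragraphs
        Set rng = para.Range
        ' Read the visible field results, not the field codes.
        rng.TextRetrievalMode.IncludeFieldCodes = False
        If SplitTOCEntry(rng.Text, strHeading, strPage) Then
            Debug.Print "Page: " & strPage & " Heading: " & strHeading
        End If
    Next para
End Sub
```

The page numbers obtained this way are whatever the TOC currently displays, so they are only as accurate as the last TOC update.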

Answer 2: (score: 0)

This inserts the page number of the referenced heading:

rng.InsertCrossReference ReferenceType:=wdRefTypeHeading, _
            ReferenceKind:=wdPageNumber, ReferenceItem:=intItem

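But it only works if you insert into the same document. You could insert into the current document and then cut and paste into the new document.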