从Word提取标题到Excel

时间:2018-06-13 13:05:13

标签: excel vba excel-vba ms-word

我有一个包含评论的word文档。我写了一个脚本来提取到Excel:

  1. 评论编号
  2. 页码
  3. 评论者的第一个首字母
  4. 评论者的姓氏
  5. 评论的撰写日期
  6. 实际评论
  7. 我无法弄清楚的问题是我还需要提取标题编号和该标题的文本。我需要第7列用于评论所在的标题。例如,假设我在标题“4.1这是标题”下的部分中有评论。我需要提取标题号(4.1)和标题文本(这是一个标题)以及相关评论。

    要创建标题,我使用了样式下功能区主页选项卡上Word中的标题功能。

    这是我到目前为止所写的内容:

     Sub Export_Comments()
    
    ' Purpose: Search for comments in any text that's been pasted into
    ' this document, then export them into a new Excel spreadsheet.
    ' Requires reference to Microsoft Excel 15.0 Object Library in VBA,
    ' which should already be saved with as part of the structure of
    ' this .docm file.
    
    Dim bResponse As Integer
    
    ' Exit routine if no comments have been found.
    If ActiveDocument.Comments.Count = 0 Then
      MsgBox ("No comments found in this document")
      Exit Sub
    Else
      bResponse = MsgBox("Do you want to export all comments to an Excel worksheet?", _
                  vbYesNo, "Confirm Comment Export")
      If bResponse = 7 Then Exit Sub
    End If
    
    ' Create a object to hold the contents of the
    ' current document and its text. (Shorthand
    ' for the ActiveDocument object.
    Dim wDoc As Document
    Set wDoc = ActiveDocument
    
    ' Create objects to help open Excel and create
    ' a new workbook behind the scenes.
    Dim xlApp As Excel.Application
    Dim xlWB As Excel.Workbook
    
    Dim i As Integer
    Dim oComment As Comment         'Comment object
    
    Set xlApp = CreateObject("Excel.Application")
    xlApp.Visible = False
    
    ' Create a new Workbook. Shouldn't interfere with
    ' other Workbooks that are already open. Will have
    ' at least one worksheet by default.
    Set xlWB = xlApp.Workbooks.Add
    
    With xlWB.Worksheets(1).Range("A1")
    
      ' Create headers for the comment information
      .Offset(0, 0) = "Comment Number"
      .Offset(0, 1) = "Page Number"
      .Offset(0, 2) = "Reviewer Initials"
      .Offset(0, 3) = "Reviewer Name"
      .Offset(0, 4) = "Date Written"
      .Offset(0, 5) = "Comment Text"
    
      ' Export the actual comments information
      For i = 1 To wDoc.Comments.Count
    
        Set oComment = wDoc.Comments(i)
        .Offset(i, 0) = oComment.Index                                                'Comment Number
        .Offset(i, 1) = oComment.Reference.Information(wdActiveEndAdjustedPageNumber) 'Page Number
        .Offset(i, 2) = oComment.Initial                                              'Author Initials
        .Offset(i, 3) = oComment.Author                                               'Author Name
        .Offset(i, 4) = Format(oComment.Date, "mm/dd/yyyy")                           'Date of Comment
        .Offset(i, 5) = oComment.Range                                                'Actual Comment
      Next i
    
    End With
    
    ' Make the Excel workbook visible
    xlApp.Visible = True
    
    ' Clean up our objects
    Set oComment = Nothing
    Set xlWB = Nothing
    Set xlApp = Nothing
    End Sub
    

1 个答案:

答案 0 :(得分:3)

您可以使用名为\HeadingLevel的内置书签为特定位置获取标题(通过应用九种可能的标题样式之一来定义)。为此,选择需要在该范围内。这将返回标题下的整个文本,因此需要将其折叠到其起始点,然后代码将使用该段落来获取ListString(编号)和文本。

文档中评论的范围是Comment.Reference

在您的代码的基础上,以下工作在我的测试环境(Word)中:

Dim rngComment As Word.Range, rngHeading As Word.Range

Set rngComment = oComment.Reference
rngComment.Select
Set rngHeading = ActiveDocument.Bookmarks("\HeadingLevel").Range
rngHeading.Collapse wdCollapseStart
Set rngHeading = rngHeading.Paragraphs(1).Range
Debug.Print rngHeading.ListFormat.ListString & " " & rngHeading.Text

我无法复制您的环境,但以下情况应该有效

 For i = 1 To wDoc.Comments.Count
   Set oComment = wDoc.Comments(i)
   Set rngComment = oComment.Reference
   rngComment.Select
   Set rngHeading = wDoc.Bookmarks("\HeadingLevel").Range
   rngHeading.Collapse wdCollapseStart
   Set rngHeading = rngHeading.Paragraphs(1).Range
  .Offset(i, 0) = oComment.Index
  .Offset(i, 1) = oComment.Reference.Information(wdActiveEndAdjustedPageNumber)
  .Offset(i, 2) = oComment.Initial    
  .Offset(i, 3) = oComment.Author
  .Offset(i, 4) = Format(oComment.Date, "mm/dd/yyyy")
  .Offset(i, 5) = oComment.Range
  .Offset(i, 6) = rngHeading.ListFormat.ListString & " " & rngHeading.Text
Next i