通过VBA将多个Word表格内容提取到Excel - 表格内容编号为行

时间:2015-02-13 06:10:59

标签: vba excel-vba word-vba excel

我几年前在您的网站上发现了一些可爱的代码,它为我提供了导出到Excel的特定表格,行和列信息,并且效果很好。 (感谢原始海报)。

但是,我现在被要求抓住另一个表格单元格,而且这个单元格中有几个段落,它们已经自动编号为单元格中的列表(或另一个,子弹指向)。我并不总是知道列表中有多少项,但我需要完整的单元格内容。

我遇到的问题是,当数据通过编码导出到Excel时,它会丢失编号,并且回车,并且基本上所有都会一起运行而不会中断前一行的数据。

例如 -

  1. P& ID 111222
  2. DWG 111-5456
  3. DOC 512BC-1234
  4. 成为出口:
    P& ID 111222DWG 111-5456DOC512BC-1234

    有人可以建议如何调整代码以阻止数据一起运行吗?我很乐意将数据放在一个Excel单元格中,如果是这样的话,我会很高兴。

    提前致谢,温迪

      Sub wordScrape()
    
    Dim wrdDoc As Object
    Dim objFiles As Object
    Dim fso As Object
    Dim wordApp As Object
    Dim sh1 As Worksheet
    Dim x As Integer
    
    ' Change this to the folder containing your word documents
    FolderName = "Y:\120\TEST"
    
    Set sh1 = ThisWorkbook.Sheets(1)
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set wordApp = CreateObject("Word.application")
    Set objFiles = fso.GetFolder(FolderName).Files
    
    x = 1
    For Each wd In objFiles
        If InStr(wd, ".docx") And InStr(wd, "~") = 0 Then
            Set wrdDoc = wordApp.Documents.Open(wd.Path, ReadOnly = True)
            
            'word document file name
            sh1.Cells(x, 1) = wd.Name
            
            'document number - Table 1, Row 2, Column 1
            sh1.Cells(x, 2) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=2, Column:=1).Range)
            
            'document title - Table 1, Row 3, Column 1
            sh1.Cells(x, 3) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=3, Column:=1).Range)
            
            'cell for tags for document - Table 1, Row 9, Column 2
            ' note - if more than 1 line, and automatic numbering in WORD doc, when exported, will remove numbering and line breaks - runs everything together
            sh1.Cells(x, 4) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=9, Column:=2).Range)
            
            'cell that notes frequency for doc - Table 1, Row 16, Column 2
            sh1.Cells(x, 5) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=16, Column:=2).Range)
            
                   
            'sh1.Cells(x, 3) = ....more extracted data....
            
            x = x + 1
        wrdDoc.Close
        End If
    
    Next wd
    wordApp.Quit
    End Sub

1 个答案:

答案 0 :(得分:2)

这将从Word表格单元格中提取常规或项目符号文本,格式化为在Excel单元格中使用。

它会添加"子弹"或者如果单词中的文本被格式化为列表,则为数字(但请注意,如果单元格具有混合格式,则数字将关闭)

'get the text from a table cell
Function CellContent(wdCell) As String
    Dim s As String, i As Long, pc As Long, p As Object
    pc = wdCell.Range.Paragraphs.Count
    'loop over paragraphs in cell (could just be 1)
    For i = 1 To pc
        s = s & IIf(i > 1, Chr(10), "") 'line break if not first para

        Set p = wdCell.Range.Paragraphs(i)
        'any list format applied ?
        Select Case p.Range.listformat.listtype
            Case 2: s = s & "* " 'bullet
            Case 3: s = s & i & ". " 'numbered
        End Select
        s = s & p.Range.Text
    Next i

    CellContent = Left(s, Len(s) - 1) 'trim off end-of-cell mark from Word
End Function

以下是您从当前Sub中调用它的方式:

sh1.Cells(x, 4) = CellContent( wrdDoc.Tables(1).Cell(9, 2) )