如何从电子邮件正文中复制特定文本?

时间:2019-01-14 08:40:05

标签: regex excel vba outlook-vba regex-group

Option Explicit

Sub GetFromInbox()

Dim olApp As Outlook.Application
Dim olNs As Outlook.Namespace
Dim olFldr As Outlook.MAPIFolder
Dim olItms As Outlook.Items
Dim olMail As Variant
Dim i As Long

Set olApp = New Outlook.Application
Set olNs = olApp.GetNamespace("MAPI")
Set olFldr = olNs.GetDefaultFolder(olFolderInbox).Folders("impMail")
Set olItms = olFldr.Items

olItms.Sort "Subject"

For Each olMail In olItms
    If InStr(olMail.Subject, "SubjectoftheEmail") > 0 Then
        ThisWorkbook.Sheets("Fixings").Cells(2, 2).Value = olMail.Body

    End If
Next olMail

Set olFldr = Nothing
Set olNs = Nothing
Set olApp = Nothing
End Sub

此代码可帮助我下载电子邮件的全文,但我需要在单元格中使用特定的粗体文本。电子邮件正文始终如下。线始终是相同的顺序。所有行始终存在。电子邮件中的所有名称都可以事先知道。

  

此电子邮件仅供内部使用

     

     

@ ABC4:请在系统中添加以下详细信息( 2019年1月12日):

     

12345_ABC_MakOpt --- 264532154.78
  12345_ABC_GAPFee --- 145626547.80

     

谢谢

´---------------------------------------------- -------         '设置         '------------------------------------------------- ----

    Dim wb As Workbook
    Dim rngEmailSubject As Range
    Dim rngInstrumentName As Range
    Dim rngDate As Range
    Dim rngAmount As Range
    Dim arrFixing() As typFixing
    Dim rngValue As Range

    Dim rowIdx As Integer
    Dim ix As Integer
    Dim fixingDate As Date

    With wb.Sheets("FixingFromEmail")

        Set rngInstrumentName = .Range("instrument.name")
        Set rngDate = .Range("Date")
        Set rngAmount = .Range("Amount")

        rowIdx = rngInstrumentName.Row
        ix = 0

        Do While True

            rowIdx = rowIdx + 1
             If Not IsEmpty(.Cells(rowIdx, rngInstrumentName.Column).Value) _
        Then

                ix = ix + 1

                ReDim Preserve arrFixing(1 To ix)
                arrFixing(ix).InstrumentName = .Cells(rowIdx, rngInstrumentName.Column).Value
                arrFixing(ix).Date = .Cells(rowIdx, rngDate.Column).Value
                arrFixing(ix).Amount = .Cells(rowIdx, rngAmount.Column).Value


            Else
                Exit Do
            End If

        Loop

    End With´

2 个答案:

答案 0 :(得分:0)

如果您总是在第一行中有一个日期,那么可以使用以下简单的方法来获取日期: [0-9] {2}-[A-Za-z] {3}-[0-9] {4}

在regex101上尝试一下,看看regex的各个部分做什么

对于另一部分,我猜最简单的方法是读取整行

答案 1 :(得分:0)

您的问题过于笼统,无法给出具体答案。我所能提供的只是第一阶段的一些指导。

您需要确定什么是固定的,什么是可变的。

“ @ ABC4”是否固定?是“ @ ABC4:请在系统中添加以下详细信息(用于)是否已固定?

是否总是有两条数据线?是否有多个数据线是示例?这些行的格式是:

Xxxxxxx space hyphen hyphen hyphen space amount 

我首先将文本主体分成几行。几乎可以肯定,回车换行会破坏行。要测试:

Dim Count As Long

For Each olMail In olItms

  Debug.Print Replace(Replace(Mid$(olMailBody, 1, 200), vbCr, "{c}"), vbLf, "{l}" & vbLf)
  Count = Count + 1
  If Count >= 10 Then
    Exit For
  End If

Next olMail

输出将类似于(最多)十个副本:

@ABC4: please add the following detail in system (for 12-Jan-2019):{c}{l}
{c}{l}
12345_ABC_MakOpt --- 264532154.78{c}{l}
12345_ABC_GAPFee --- 145626547.80{c}{l}
Are the characters between lines “{c}{l}” or “{l}” or something else?

在下面的代码中,如有必要,请替换vbCR & vbLf并运行它:

Dim Count As Long
Dim InxL As Long
Dim Lines() As String

For Each olMail In olItms

  Lines = Split(olMail.Body, vbCR & vbLf)
  For InxL = 0 to UBound(Lines)
    Debug.Print InxL + 1 & "  " & Lines(InxL)
  Next
  Count = Count + 1
  If Count >= 10 Then
    Exit For
  End If

Next

输出将类似于(最多)十个副本:

0  
1  @ABC4: please add the following detail in system (for 12-Jan-2019):
2  
3  12345_ABC_MakOpt --- 264532154.78
4  12345_ABC_GAPFee --- 145626547.80
5 

现在您可以看到文本行。注意:第一行是数字0。顶部永远不会有空行吗?顶部总是有空白行吗?会有所不同吗?我将假设顶部始终有一个空白行。如果该假设不正确,则下面的代码将需要修改。

如果第1行是“ xxxxxxxxxx日期):”,则可以提取日期,这样:

Dim DateCrnt As Date
Dim Pos As Long

DateCrnt = CDate(Left$(Right$(Lines(1), 13), 11))

Pos = InStr(1, Lines(1), "(for ")
DateCrnt = CDate(Mid$(Lines(1), Pos + 5, 11))

注意:这两种方法都取决于行尾,正如示例中所示。如果有任何变化,您将需要处理该变化的代码。

您现在可以使用以下代码拆分数据行:

Dim NameCrnt As String
Dim AmtCrnt As Double

For InxL = 3 To UBound(Lines)
  If Lines(InxL) <> "" Then
    Pos = InStr(1, Lines(InxL), " --- ")
    If Pos = 0 Then
      Debug.Assert False   ' Line not formatted as expected
    Else
      NameCrnt = Mid$(Lines(InxL), 1, Pos - 1)
      AmtCrnt = Mid$(Lines(InxL), Pos + 5)
    End If
    Debug.Print "Date="& DateCrnt & "    " & "Name=" & NameCrnt & "   " & "Amount=" & AmtCrnt
  End If
Next

输出为:

Date=12/01/2019    Name=12345_ABC_MakOpt   Amount=264532154.78
Date=12/01/2019    Name=12345_ABC_GAPFee   Amount=145626547.8

新部分显示了如何从电子邮件向工作表添加数据

这是本节的第二版,因为OP改变了对所需格式的想法。

下面的代码已经过测试,但使用的是我创建的伪造的电子邮件,看起来像是您所询问的电子邮件。因此可能需要一些调试。

我用以下标题创建了一个新的工作簿和一个名为“ Fixings”的新工作表:

Empty worksheet before macro run

处理完我的虚假电子邮件后,工作表如下:

Worksheet after runs to add data from three daily emails

行的顺序取决于找到电子邮件的顺序。您可能首先想要最新的。对工作表进行排序不在此答案的范围内。注意:列标题告诉宏要记录哪些值。如果在电子邮件中添加了新行,请添加新的列标题,并且将保存该值而无需更改宏。

除了一个例外,我将不解释我使用的VBA语句,因为可以很容易地在线搜索“ VBA xxxxx”并查找语句xxxxx的规范。例外是使用两个集合来保存未决数据。其余解释说明了我采用此方法的原因。

需求可能会有所变化,尽管可能不会持续六到十二个月。例如,经理将需要不同的标题或不同顺序的列。您无法预期将需要进行哪些更改,但可以为更改做准备。例如,在我的代码顶部,我有:

Const ColFixDate As Long = 1
Const ColFixDataFirst As Long = 2
Const RowFixHead As Long = 1
Const RowFixDataFirst As Long = 2

我本可以写Cells(Row, 1).Value = Date。这有两个缺点:(1)如果date列曾经被移动过,则您必须在代码中搜索访问它的语句,(2)您必须记住第1或2或3列中的内容,这使您的代码更难于理解。读。我避免将文字用于行号或列号。键入ColFixDataFirst(而不是2)的额外工作很快就会收回回报。

我注意到在添加到您的问题中的代码中,您使用命名范围来达到相同的效果。 VBA的问题在于,通常有几种方法可以达到相同的效果。我更喜欢常量,但我们每个人都必须选择自己喜欢的常量。

在一个部门工作过,该部门处理了许多来自外部的电子邮件和工作簿,其中包含有用的数据,我可以告诉您它们的格式一直在变化。将有一个额外的空白行,或者将删除现有的空白行。将会有额外的数据,或者现有数据将以不同的顺序进行。作者做出了他们认为会有所帮助的更改,但是很少做有用的事情,例如询问接收者是否愿意更改,甚至警告他们更改。我见过的最糟糕的情况是两个数字列被颠倒了,而且几个月都没有注意到。幸运的是,我没有参与其中,因为这是一场噩梦,要从我们的系统中删除有问题的数据,然后再导入正确的数据。我会检查所有我能想到的内容,并拒绝处理与我期望的不完全相同的电子邮件。错误消息都被写入立即窗口,这在开发过程中很方便。您可能要使用MsgBox或将其写入文件。如果电子邮件处理成功,则不会删除该电子邮件;将其移至子文件夹,以便在再次需要时可以对其进行检索。

olMail是一个Outlook常量。请勿将olMail或任何其他保留字用作变量名。

我使用Session而不是命名空间。它们应该是等效的,但我曾经遇到无法诊断的NameSpace问题,因此不再使用它们。

我不对电子邮件进行排序,因为您的代码没有利用对电子邮件进行排序的优势。也许您可以利用按ReceivedTime进行排序的优势,但我看到了难以避免的潜在问题。

我按相反的顺序处理电子邮件,因为它们是按位置访问的。如果将电子邮件5移动到另一个文件夹,则以前的电子邮件6现在就是电子邮件5,并且For循环将其跳过。如果以相反的顺序处理电子邮件,则您不介意电子邮件6现在是电子邮件5,因为您已经处理过该电子邮件。

如果您未设置包含日期或金额的单元格中的NumberFormat,则会根据Microsoft在您所在国家/地区的默认设置显示它们。我使用了我最喜欢的显示格式。更改为您的收藏夹。

在处理完整个电子邮件并提取所需的数据之前,该代码不会向工作表输出任何内容。这意味着必须存储来自早期数据行的数据,直到处理完所有行。我使用了两个CollectionsPendingNamesPendingAmts。这不是将数据存储在我为自己编写的宏中的方式。我的问题是替代方法更复杂,或者需要更高级的VBA。

回去问其他您不了解的问题。

Option Explicit
Sub GetFromInbox()

  Const ColFixDate As Long = 1
  Const ColFixName As Long = 2
  Const ColFixAmt As Long = 3
  Const RowFixDataFirst As Long = 2

  Dim AmtCrnt As Double
  Dim ColFixCrnt As Long
  Dim DateCrnt As Date
  Dim ErrorOnEmail As Boolean
  Dim Found As Boolean
  Dim InxItem As Long
  Dim InxLine As Long
  Dim InxPend As Long
  Dim Lines() As String
  Dim NameCrnt As String
  Dim olApp As New Outlook.Application
  Dim olFldrIn As Outlook.Folder
  Dim olFldrOut As Outlook.Folder
  Dim olMailCrnt As Outlook.MailItem
  Dim PendingAmts As Collection
  Dim PendingNames As Collection
  Dim Pos As Long
  Dim RowFixCrnt As Long
  Dim StateEmail As Long
  Dim TempStg As String
  Dim WshtFix As Worksheet

  Set WshtFix = ThisWorkbook.Worksheets("Fixings")
  With WshtFix
    RowFixCrnt = .Cells(Rows.Count, ColFixDate).End(xlUp).Row + 1
  End With

  Set olApp = New Outlook.Application
  Set olFldrIn = olApp.Session.GetDefaultFolder(olFolderInbox).Folders("impMail")
  Set olFldrOut = olFldrIn.Folders("Processed")

  For InxItem = olFldrIn.Items.Count To 1 Step -1

    If olFldrIn.Items(InxItem).Class = Outlook.olMail Then

      Set olMailCrnt = olFldrIn.Items(InxItem)

      If InStr(olMailCrnt.Subject, "SubjectoftheEmail") > 0 Then
        Lines = Split(olMailCrnt.Body, vbCr & vbLf)

        'For InxLine = 0 To UBound(Lines)
        '  Debug.Print InxLine + 1 & "  " & Lines(InxLine)
        'Next

        StateEmail = 0    ' Before "please add ..." line
        ErrorOnEmail = False
        Set PendingAmts = Nothing
        Set PendingNames = Nothing
        Set PendingAmts = New Collection
        Set PendingNames = New Collection

        For InxLine = 0 To UBound(Lines)
          NameCrnt = ""     ' Line is not a data line
          Lines(InxLine) = Trim(Lines(InxLine))  ' Remove any leading or trailing spaces

          ' Extract data from line
          If Lines(InxLine) <> "" Then
            If StateEmail = 0 Then
              If InStr(1, Lines(InxLine), "please add the ") = 0 Then
                Debug.Print "Error with email received " & olMailCrnt.ReceivedTime & vbLf & _
                            "  The first non-blank line is" & vbLf & _
                            "    " & Lines(InxLine) & vbLf & _
                            "  but I was expecting something like:" & vbLf & _
                            "    @ABC4: please add the following detail in system (for 13-Jan-2019):"
                ErrorOnEmail = True
                Exit For
              End If
              TempStg = Left$(Right$(Lines(InxLine), 13), 11)
              If Not IsDate(TempStg) Then
                Debug.Print "Error with email received " & olMailCrnt.ReceivedTime & vbLf & _
                            "  The value I extracted from the ""please add the ...""" & _
                            " line is """ & vbLf & "  " & TempStg & _
                            """ which I do not recognise as a date"
                ErrorOnEmail = True
                Exit For
              End If
              DateCrnt = CDate(TempStg)
              StateEmail = 1    ' After "please add ..." line
            ElseIf StateEmail = 1 Then
              If Lines(InxLine) = "" Then
                ' Ignore blank line
              ElseIf Lines(InxLine) = "thanks" Then
                ' No more data lines
                Exit For
              Else
                Pos = InStr(1, Lines(InxLine), " --- ")
                If Pos = 0 Then
                  Debug.Print "Error with email received " & olMailCrnt.ReceivedTime & vbLf & _
                              "  Data line: " & Lines(InxLine) & vbLf & _
                              "    does not contain ""---"" as required"
                  ErrorOnEmail = True
                  'Debug.Assert False
                  Exit For
                End If
                NameCrnt = Mid$(Lines(InxLine), 1, Pos - 1)
                TempStg = Mid$(Lines(InxLine), Pos + 5)
                If Not IsNumeric(TempStg) Then
                  Debug.Print "Error with email received " & olMailCrnt.ReceivedTime & vbLf & _
                              "  Data line:" & Lines(InxLine) & vbLf & _
                              "    value after ""---"" is not an amount"
                  ErrorOnEmail = True
                  'Debug.Assert False
                  Exit For
                End If
                AmtCrnt = CDbl(TempStg)
              End If
            End If  ' StateEmail
          End If ' Lines(InxLine) <> ""

          If ErrorOnEmail Then
            ' Ignore any remaining lines
            Exit For
          End If

          If NameCrnt <> "" Then
            ' Line was a data line without errors. Save until know entire email is error free
            PendingNames.Add NameCrnt
            PendingAmts.Add AmtCrnt
          End If

        Next InxLine

        If Not ErrorOnEmail Then
          ' Output pending rows now know entire email is error-free
          With WshtFix
            For InxPend = 1 To PendingNames.Count
              With .Cells(RowFixCrnt, ColFixDate)
                .Value = DateCrnt
                .NumberFormat = "d mmm yy"
              End With
              .Cells(RowFixCrnt, ColFixName).Value = PendingNames(InxPend)
              With .Cells(RowFixCrnt, ColFixAmt)
                .Value = PendingAmts(InxPend)
                .NumberFormat = "#,##0.00"
              End With
              RowFixCrnt = RowFixCrnt + 1
            Next
          End With
          ' Move fully processed email to folder Processed
          olMailCrnt.Move olFldrOut
        End If

      End If  ' InStr(olMailCrnt.Subject, "SubjectoftheEmail") > 0
    End If  ' olFldrIn.Items(InxItem).Class = Outlook.olMail

  Next InxItem

  Set olFldrIn = Nothing
  Set olFldrOut = Nothing
  olApp.Quit
  Set olApp = Nothing

End Sub