[Question title]: Removing a string that includes CRLF characters from the body of an e-mail
[Posted]: 2020-01-12 09:11:33
[Question]:

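I am trying to remove a string from selected incoming MS Outlook (2016) e-mails.

The string consists of two sentences in German. I use the Replace() function, and this mostly works. (See the full procedure below.)

The two sentences are sometimes broken up by CRLF (carriage return, line feed) characters, and these are not always in the same place. This seems to be the result of these e-mails passing through various devices before they reach my Outlook inbox.

Solving the easy part of the problem first

Before tackling CRLFs that occur in varying positions, I want to create a procedure that handles a string with CRLFs in fixed positions.

What the source of such a string looks like:

(How the screenshot was made: I saved the e-mail as .html on my hard disk, then opened the .html file in Notepad++ to see the CRLF characters.)

The html tags do not concern me. They can stay in the e-mail. (In fact, the formatting tags also vary, so it is better not to start dealing with them.) My only concern is to remove the visible part, i.e. the text "Diese E-Mail kommt...vertrauenswürdig halten".

I tried to capture the text with its line breaks by including the CR LF parts as Chr():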

strDelete01 = "Diese E-Mail kommt von Personen" & Chr(13) & Chr(10) & "außerhalb der Stadtverwaltung. Klicken Sie nur auf Links oder Dateianhänge," & Chr(13) & Chr(10) & "wenn Sie die Personen für vertrauenswürdig halten."

My procedure does not recognize the string and therefore does nothing.

My script so far

Public Sub EditBodyCgReplace()

'Declarations
   Dim obj As Object
   Dim Sel As Outlook.Selection
   Dim DoSave As Boolean
   Dim NewBody As String
   Dim strDelete01 As String
   Dim strDelete02 As String
   Dim strDelete03 As String
   Dim strDelete04 As String

'Fill the variables 
   strDelete01 = "Diese E-Mail kommt von Personen außerhalb der Stadtverwaltung. Klicken Sie nur auf Links oder Dateianhänge, wenn Sie die Personen für vertrauenswürdig halten."
   strDelete02 = "################################################################################"
   strDelete03 = "<hr>"
   strDelete04 = "Diese E-Mail kommt von Personen" & Chr(13) & Chr(10) & "außerhalb der Stadtverwaltung. Klicken Sie nur auf Links oder Dateianhänge," & Chr(13) & Chr(10) & "wenn Sie die Personen für vertrauenswürdig halten."

'Note: I am playing here with various types of strings at once. For example, 
'the procedure will also remove <hr> lines and "#####" strings  

'Work with it 
    If TypeOf Application.ActiveWindow Is Outlook.Inspector Then
        Set obj = Application.ActiveInspector.CurrentItem
    Else
        Set Sel = Application.ActiveExplorer.Selection
        If Sel.Count Then
            Set obj = Sel(1)
            DoSave = True
        End If
    End If

    If Not obj Is Nothing Then
        NewBody = Replace(obj.HTMLBody, strDelete01, "")
        NewBody = Replace(obj.HTMLBody, strDelete02, "")
        NewBody = Replace(obj.HTMLBody, strDelete03, "")
        NewBody = Replace(obj.HTMLBody, strDelete04, "")

        If NewBody <> "" Then
            obj.HTMLBody = NewBody
            If DoSave Then
                obj.Save
            End If
        End If
    End If
    End Sub

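Question: How can I include CRLF in the search string?

Follow-up question: What can I do to remove such strings when the CRLFs they contain are in varying positions? Is there a way to use regular expressions? Can VBA in Outlook handle them? Idea: if regular expressions work, perhaps the whole CRLF issue is no longer a problem, because the expression would look like

"Diese E-Mail kommt von * vertrauenswürdig halten."

and would therefore include anything in between, including CRLFs?

Perhaps important

After various experiments, I am beginning to suspect that MS Outlook does not use HTML in its e-mails at all.

I find that I cannot actually address any html code in obj.HTMLBody. I can address plain text. (Yesterday I was able to address some parts of the html, such as "<hr>", and thus remove them, but I cannot reproduce the conditions under which that worked.)

I can save e-mails as html files (outside Outlook, in a separate folder on my hard disk), and in those files I do see CRLFs and other things. But perhaps e-mails, as long as they are kept within Outlook itself, are stored using some other code?

So what is this code, and how can I remove parts of it?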

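[Comments]:

  • I see you have already noticed the full diagnostic tool that I use regularly. I have now added my regular expression solution. I understand from your question that the CRLFs are in different positions in different e-mails. This solution handles the moving CRLFs without removing any other CRLFs from the e-mail.

Tags: vba outlook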


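[Solution 1]:

My full diagnostic routine

Subroutine InvestigateEmails() outputs either to the Immediate Window or to a file on the desktop. The Immediate Window is usually more convenient but is limited to about 200 lines. So if the output is likely to exceed 200 lines, you must output to a file. If the output will be under 200 lines, the choice is yours.

To output to the Immediate Window, review subroutine OutSomeProperties. Add any properties you wish to see that are missing. Consider removing any properties that you do not currently need. Check that #Const Selected = True.

To output to a file, review subroutine OutAllProperties. More accurately, this should be "all properties I know of and have ever been interested in". You may wish to check that all the properties that interest you are included. I suggest you do not remove any existing properties. Check that #Const Selected = False.

Select the e-mails whose properties you wish to see. Run subroutine InvestigateEmails().

This code uses conditional compilation, which can confuse programmers unfamiliar with the technique. Either study conditional compilation or accept that it is doing something useful that you do not need to understand.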
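For readers new to conditional compilation, the following is a minimal sketch of the idea on its own, separate from the diagnostic routines. The constant VerboseOutput and the function DescribeBuild are hypothetical names chosen for illustration only.

```vba
' Hypothetical flag for illustration.  It is evaluated when the module is
' compiled, not while the macro runs.
#Const VerboseOutput = True

Public Function DescribeBuild() As String
  #If VerboseOutput Then
    ' Only this branch is compiled while VerboseOutput is True.
    DescribeBuild = "verbose"
  #Else
    ' This branch is skipped by the compiler, so it could refer to
    ' variables or libraries that are not available without causing
    ' a compile error.
    DescribeBuild = "brief"
  #End If
End Function
```

This is why the variables declared under "#If Not Selected" in InvestigateEmails do not need to exist when Selected is True: the compiler never sees those lines.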

Option Explicit

' This code requires references to:
'    "Microsoft Scripting Runtime"
'   "Microsoft ActiveX Data Objects n.n Library".  Tested with version 6.1.

Public Sub InvestigateEmails()

  ' Outputs all or selected properties of one or more emails.

  ' ========================================================================
  ' "Selected = True" to output a small number of properties for
  ' a small number of emails to the Immediate Window.
  ' "Selected = False" to output all properties for any number of emails
  ' to desktop file "InvestigateEmails.txt".
  #Const Selected = True
  ' ========================================================================

  ' Technique for locating desktop from answer by Kyle:
  '                     http://stackoverflow.com/a/17551579/973283

  Dim Exp As Explorer
  Dim ItemCrnt As MailItem

  #If Not Selected Then
    Dim FileBody As String
    Dim Fso As FileSystemObject
    Dim Path As String

    Path = CreateObject("WScript.Shell").specialfolders("Desktop")
  #End If

  Set Exp = Outlook.Application.ActiveExplorer

  If Exp.Selection.Count = 0 Then
    Call MsgBox("Please select one or more emails then try again", vbOKOnly)
    Exit Sub
  Else
    For Each ItemCrnt In Exp.Selection
      If ItemCrnt.Class = olMail Then
        #If Selected Then
          Call OutSomeProperties(ItemCrnt)
        #Else
          Call OutAllProperties(ItemCrnt, FileBody)
        #End If
      End If
    Next
  End If

  #If Not Selected Then
    Call PutTextFileUtf8NoBom(Path & "\InvestigateEmails.txt", FileBody)
  #End If

End Sub
Public Sub OutSomeProperties(ItemCrnt As Outlook.MailItem)

  ' Outputs selected properties of a MailItem to the Immediate Window.

  ' The Immediate Window can only display about 200 rows before the older
  ' rows start scrolling off the top.  This means this routine is only
  ' suitable for displaying a small number of simple properties.  Add or
  ' remove properties as necessary to meet the current requirement.

  Dim InxA As Long
  Dim InxR As Long

  Debug.Print "=============================================="
  With ItemCrnt
    Debug.Print "  EntryId: " & .EntryID
    Debug.Print "  Created: " & .CreationTime
    Debug.Print " Receiver: " & .ReceivedByName
    Debug.Print " Received: " & .ReceivedTime
    For InxR = 1 To .Recipients.Count
      Debug.Print "Recipient: " & .Recipients(InxR)
    Next
    Debug.Print "   Sender: " & .Sender
    Debug.Print " SenderEA: " & .SenderEmailAddress
    Debug.Print " SenderNm: " & .SenderName
    Debug.Print "   SentOn: " & .SentOn
    Debug.Print "  Subject: " & .Subject
    Debug.Print "       To: " & .To
    If .Attachments.Count > 0 Then
      Debug.Print "Attachments:"
      For InxA = 1 To .Attachments.Count
        Debug.Print "    " & InxA & ": " & .Attachments(InxA).DisplayName
      Next
    End If
  End With

End Sub
Sub OutAllProperties(ItemCrnt As Outlook.MailItem, ByRef FileBody As String)

  ' Adds all properties of a MailItem to FileBody.

  ' The phrase "all properties" should more correctly be "all properties
  ' that I know of and have ever been interested in".

  ' Source of PropertyAccessor information:
  '   https://www.slipstick.com/developer/read-mapi-properties-exposed-outlooks-object-model/

  Dim InxA As Long
  Dim InxR As Long
  Dim PropAccess As Outlook.propertyAccessor

  If FileBody <> "" Then
    FileBody = FileBody & String(80, "=") & vbLf
  End If

  With ItemCrnt
    FileBody = FileBody & "EntryId: " & .EntryID
    FileBody = FileBody & vbLf & "From (Sender): " & .Sender
    FileBody = FileBody & vbLf & "From (Sender name): " & .SenderName
    FileBody = FileBody & vbLf & "From (Sender email address): " & _
                                                     .SenderEmailAddress
    FileBody = FileBody & vbLf & "Subject: " & CStr(.Subject)
    FileBody = FileBody & vbLf & "Received: " & Format(.ReceivedTime, "dmmmyy hh:mm:ss")
    FileBody = FileBody & vbLf & "To: " & .To
    FileBody = FileBody & vbLf & "CC: " & .CC
    FileBody = FileBody & vbLf & "BCC: " & .BCC
    ' List recipients before the attachment table so they are output
    ' whether or not the email has attachments.
    For InxR = 1 To .Recipients.Count
      FileBody = FileBody & vbLf & "Recipient" & InxR & ": " & .Recipients(InxR)
    Next
    If .Attachments.Count = 0 Then
      FileBody = FileBody & vbLf & "No attachments"
    Else
      FileBody = FileBody & vbLf & "Attachments:"
      FileBody = FileBody & vbLf & "No.|Type|Path|Filename|DisplayName|"
      For InxA = 1 To .Attachments.Count
        With .Attachments(InxA)
          FileBody = FileBody & vbLf & InxA & "|"
          Select Case .Type
            Case olByValue
              FileBody = FileBody & "Val"
            Case olEmbeddeditem
              FileBody = FileBody & "Ebd"
            Case olByReference
              FileBody = FileBody & "Ref"
            Case olOLE
              FileBody = FileBody & "OLE"
            Case Else
              FileBody = FileBody & "Unk"
          End Select
          ' Not all types have all properties.  This code handles
          ' those missing properties of which I am aware.  However,
          ' I have never found an attachment of type Reference or OLE.
          ' Additional code may be required for them.
          Select Case .Type
            Case olEmbeddeditem
              FileBody = FileBody & "|"
            Case Else
              FileBody = FileBody & "|" & .Pathname
          End Select
          FileBody = FileBody & "|" & .FileName
          FileBody = FileBody & "|" & .DisplayName & "|"
        End With
      Next
    End If  ' .Attachments.Count = 0
    Call OutLongTextRtn(FileBody, "Text: ", .Body)
    Call OutLongTextRtn(FileBody, "Html: ", .HtmlBody)

    Set PropAccess = .propertyAccessor

    FileBody = FileBody & vbLf & "PR_RECEIVED_BY_NAME: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0040001E")
    FileBody = FileBody & vbLf & "PR_SENT_REPRESENTING_NAME: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0042001E")
    FileBody = FileBody & vbLf & "PR_REPLY_RECIPIENT_NAMES: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0050001E")
    FileBody = FileBody & vbLf & "PR_SENT_REPRESENTING_EMAIL_ADDRESS: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0065001E")
    FileBody = FileBody & vbLf & "PR_RECEIVED_BY_EMAIL_ADDRESS: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0076001E")
    FileBody = FileBody & vbLf & "PR_TRANSPORT_MESSAGE_HEADERS:" & vbLf & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x007D001E")
    FileBody = FileBody & vbLf & "PR_SENDER_NAME: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0C1A001E")
    FileBody = FileBody & vbLf & "PR_SENDER_EMAIL_ADDRESS: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0C1F001E")
    FileBody = FileBody & vbLf & "PR_DISPLAY_BCC: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0E02001E")
    FileBody = FileBody & vbLf & "PR_DISPLAY_CC: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0E03001E")
    FileBody = FileBody & vbLf & "PR_DISPLAY_TO: " & _
                           PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0E04001E")
    FileBody = FileBody & vbLf

    Set PropAccess = Nothing

  End With

End Sub
Sub OutLongTextRtn(ByRef TextOut As String, ByVal Head As String, _
                          ByVal TextIn As String)

  ' * Break TextIn into lines of not more than 100 characters
  '   and append to TextOut.
  ' * The output is arranged so:
  '     xxxx|sssssssssssssss|
  '         |sssssssssssssss|
  '         |ssssssssss|
  '   where "xxxx" is the value of Head and "ssss..." are characters from
  '         TextIn.  The third line in the example could be shorter because:
  '           * it contains the last few characters of TextIn
  '           * there is a linefeed in TextIn
  '           * a <xxx> string recording whitespace would have been split
  '             across two lines.

  If TextIn = "" Then
    ' Nothing to do
    Exit Sub
  End If

  Const LenLineMax As Long = 100

  Dim PosBrktEnd As Long     ' Last > before PosEnd
  Dim PosBrktStart As Long   ' Last < before PosEnd
  Dim PosNext As Long        ' Start of block to be output after current block
  Dim PosStart As Long       ' First character of TextIn not yet output

  TextIn = TidyTextForDspl(TextIn)
  TextIn = Replace(TextIn, "lf›", "lf›" & vbLf)

  PosStart = 1
  Do While True
    PosNext = InStr(PosStart, TextIn, vbLf)
    If PosNext = 0 Then
      ' No LF in [Remaining] TextIn
      'Debug.Assert False
      PosNext = Len(TextIn) + 1
    End If
    If PosNext - PosStart > LenLineMax Then
      PosNext = PosStart + LenLineMax
    End If
    ' Check for <xxx> being split across lines
    PosBrktStart = InStrRev(TextIn, "‹", PosNext - 1)
    PosBrktEnd = InStrRev(TextIn, "›", PosNext - 1)
    If PosBrktStart < PosStart And PosBrktEnd < PosStart Then
      ' No <xxx> within text to be displayed
      ' No change to PosNext
      'Debug.Assert False
    ElseIf PosBrktStart > 0 And PosBrktEnd > 0 And PosBrktEnd > PosBrktStart Then
      ' Last or only <xxx> totally within text to be displayed
      ' No change to PosNext
      'Debug.Assert False
    ElseIf PosBrktStart > 0 And _
           (PosBrktEnd = 0 Or (PosBrktEnd > 0 And PosBrktEnd < PosBrktStart)) Then
      ' Last or only <xxx> will be split across rows
      'Debug.Assert False
      PosNext = PosBrktStart
    Else
      ' Are there other combinations?
      Debug.Assert False
    End If

    'Debug.Assert Right$(Mid$(TextIn, PosStart, PosNext - PosStart), 1) <> "‹"

    If TextOut <> "" Then
      TextOut = TextOut & vbLf
    End If
    If PosStart = 1 Then
      TextOut = TextOut & Head & "|"
    Else
      TextOut = TextOut & Space(Len(Head)) & "|"
    End If
    TextOut = TextOut & Mid$(TextIn, PosStart, PosNext - PosStart) & "|"
    PosStart = PosNext
    If Mid$(TextIn, PosStart, 1) = vbLf Then
      PosStart = PosStart + 1
    End If
    If PosStart > Len(TextIn) Then
      Exit Do
    End If
  Loop

End Sub
Sub PutTextFileUtf8NoBom(ByVal PathFileName As String, ByVal FileBody As String)

  ' Outputs FileBody as a text file named PathFileName using
  ' UTF-8 encoding without leading BOM

  '  1Nov16  Copied from http://stackoverflow.com/a/4461250/973283
  '          but replaced literals with parameters.
  ' 15Aug17  Discovered routine was adding an LF to the end of the file.
  '          Added code to discard that LF.
  ' 11Oct17  Posted to StackOverflow
  '  9Aug18  Comment from rellampec suggested removal of adWriteLine from
  '          WriteText statement would avoid adding LF.
  ' 30Sep18  Amended routine to remove adWriteLine from WriteText statement
  '          and code to remove LF from file. Successfully tested new version.

  ' References: http://stackoverflow.com/a/4461250/973283
  '             https://www.w3schools.com/asp/ado_ref_stream.asp

  Dim BinaryStream As Object
  Dim UTFStream As Object

  Set UTFStream = CreateObject("adodb.stream")

  UTFStream.Type = adTypeText
  UTFStream.Mode = adModeReadWrite
  UTFStream.Charset = "UTF-8"
  UTFStream.Open
  UTFStream.WriteText FileBody

  UTFStream.Position = 3 'skip BOM

  Set BinaryStream = CreateObject("adodb.stream")
  BinaryStream.Type = adTypeBinary
  BinaryStream.Mode = adModeReadWrite
  BinaryStream.Open

  UTFStream.CopyTo BinaryStream

  UTFStream.Flush
  UTFStream.Close
  Set UTFStream = Nothing

  BinaryStream.SaveToFile PathFileName, adSaveCreateOverWrite
  BinaryStream.Flush
  BinaryStream.Close
  Set BinaryStream = Nothing

End Sub

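[Comments]:

  • This afternoon I finally had a chance to test the cut-down version of your tool. It works flawlessly. I therefore also understand its built-in limitation: only about 200 lines of output can be shown in the Immediate Window. I immediately thought: well, then save the output to a file. Coming back here, I now find that your full routine is built to do exactly that, and more. Thank you very much! This is incredible support for my little project of developing a macro to mitigate the absurdities into which e-mail security paranoia has led our municipal IT department. [continued]
  • [continued from above] My macro already does the basic thing I want it to do (remove certain repetitive content from incoming e-mails), but with the help of your analysis tool I will be able to improve it considerably. Thanks again!
  • Tool test: works flawlessly for me. The HTML content of the e-mail is put into a .txt file in the "Desktop" folder. (When this file is opened in an editor such as Notepad++, we see the HTML plus the line breaks added to improve readability for the human eye, as described in the tool.) Great tool!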
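[Solution 2]:

I will answer your question in parts as I have time. Someone else may get to the important part before me.

I have edited your question. There were a couple of sentences I could not understand, so I looked at the source and found my suspicion was correct: you had included less-than characters. Stack Overflow allows a limited number of Html tags. Anything else that looks like an Html tag is ignored. I replaced each "<" with "&lt;".

You have: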

NewBody = Replace(obj.HTMLBody, strDelete01, "")
NewBody = Replace(obj.HTMLBody, strDelete02, "")
NewBody = Replace(obj.HTMLBody, strDelete03, "")
NewBody = Replace(obj.HTMLBody, strDelete04, "")
If NewBody <> "" Then

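Each Replace (except the first) overwrites the value of NewBody created by the previous Replace. You seem to think that NewBody will be empty if strDelete04 is not found. No: if strDelete04 is not found, NewBody will be a copy of obj.HTMLBody.

You need something like this: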

NewBody = Replace(obj.HTMLBody, strDelete01, "")
NewBody = Replace(NewBody, strDelete02, "")
NewBody = Replace(NewBody, strDelete03, "")
NewBody = Replace(NewBody, strDelete04, "")
If NewBody <> obj.HTMLBody Then
  ' One or more delete strings found and removed
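The same chaining can be wrapped in a small helper so that each Replace always works on the result of the previous one. This is only a sketch: RemoveStrings is a hypothetical name, not part of the original procedure.

```vba
Public Function RemoveStrings(ByVal Text As String, _
                              ParamArray Unwanted() As Variant) As String

  ' Returns Text with every occurrence of each string in Unwanted removed.
  ' Each Replace is applied to the output of the previous one, so no
  ' earlier removal is lost.

  Dim Inx As Long

  For Inx = LBound(Unwanted) To UBound(Unwanted)
    Text = Replace(Text, Unwanted(Inx), "")
  Next

  RemoveStrings = Text

End Function
```

Under that assumption, the four lines above could become NewBody = RemoveStrings(obj.HTMLBody, strDelete01, strDelete02, strDelete03, strDelete04), followed by the same comparison of NewBody with obj.HTMLBody to decide whether anything was removed.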

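You say the CRLFs are not in fixed positions. If so, no simple modification of your code will give the effect you seek. I will show you how to achieve it, but first I must create some e-mails containing your text so that I can test my code.

Part 2

After looking more closely at your Html image, I believe there is a simple solution. The two CRLFs within the text replace spaces. If that is always the case, you can use: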

NewBody = Replace(obj.HTMLBody, vbCr & vbLf, " ")

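This replaces every CRLF that appears anywhere in the Html with a space. It does not matter if other CRLFs are affected as well because, when the document is displayed, any string of whitespace characters (including CR and LF) in an Html document is replaced by a single space.

You then finish by removing the unwanted text: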

' VBA does not allow a value to be assigned within a Dim statement
Dim strDelete As String

strDelete = "Diese E-Mail kommt von Personen außerhalb " & _
            "der Stadtverwaltung. Klicken Sie nur auf " & _
            "Links oder Dateianhänge, wenn Sie die Personen " & _
            "für vertrauenswürdig halten."

NewBody = Replace(NewBody, strDelete, "")
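If you would rather not touch the other CRLFs in the e-mail, a regular expression can treat every space in the unwanted sentence as "any run of whitespace". The following is a sketch under assumptions, not a tested production routine: RemoveSentence is a hypothetical name, and it uses the late-bound VBScript.RegExp object, so no extra reference is required. It will not match if an Html tag has been inserted in the middle of the sentence.

```vba
Public Function RemoveSentence(ByVal Html As String, _
                               ByVal Sentence As String) As String

  ' Removes Sentence from Html, allowing any run of whitespace (spaces,
  ' tabs, CR, LF) wherever Sentence contains a single space.

  Dim Ch As String
  Dim Inx As Long
  Dim Pattern As String
  Dim RegEx As Object

  ' Build the pattern one character at a time: a space becomes "\s+",
  ' regex metacharacters are escaped, everything else is copied.
  For Inx = 1 To Len(Sentence)
    Ch = Mid$(Sentence, Inx, 1)
    If Ch = " " Then
      Pattern = Pattern & "\s+"
    ElseIf InStr("\^$.|?*+()[]{}", Ch) > 0 Then
      Pattern = Pattern & "\" & Ch
    Else
      Pattern = Pattern & Ch
    End If
  Next

  Set RegEx = CreateObject("VBScript.RegExp")
  RegEx.Global = True        ' remove every occurrence, not just the first
  RegEx.IgnoreCase = False
  RegEx.Pattern = Pattern

  RemoveSentence = RegEx.Replace(Html, "")

End Function
```

With this helper, NewBody = RemoveSentence(obj.HTMLBody, strDelete) would remove the sentence wherever the CRLFs happen to fall, while leaving all other line breaks in the Html untouched.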

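If the above does not work, you need a more convenient diagnostic technique. Saving the whole e-mail as Html may be easy, but you cannot be sure how the result differs from what a VBA macro sees. You wonder whether Outlook stores e-mails in a format other than Html. I cannot imagine why Outlook would convert an incoming SMTP message to some secret format and then convert it back when the user wishes to view it. If Outlook does have a secret format, it is completely hidden from the VBA programmer.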
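As a quick first check of what a macro really sees, you can list the character codes of a short piece of the Html body. This is a minimal sketch; CharCodes is a hypothetical helper name.

```vba
Public Function CharCodes(ByVal Text As String) As String

  ' Returns the code of each character in Text, separated by spaces, so
  ' that CR (13), LF (10), tab (9) and non-break space (160) are visible.

  Dim Inx As Long
  Dim Result As String

  For Inx = 1 To Len(Text)
    If Inx > 1 Then
      Result = Result & " "
    End If
    Result = Result & AscW(Mid$(Text, Inx, 1))
  Next

  CharCodes = Result

End Function
```

For example, after locating the start of the unwanted text with InStr (and checking that the result is not zero), Debug.Print CharCodes(Mid$(obj.HTMLBody, Pos, 60)) shows whether the characters between the words are spaces (32), CRLF (13 10) or something else. Pos is an assumed variable holding the InStr result.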

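Below is a simple version of the diagnostic tool I use. I can provide something more advanced if you need it, but let us try this first.

Copy the code below to an Outlook module. Select one of the e-mails, then run macro DsplHtmlBodyFromSelectedEmails. The entire Html body of the e-mail will be output to the Immediate Window in a readable format. I believe I have included every subroutine the macro calls. If I have not, I apologise in advance. If you get a message about an undefined routine, let me know and I will add it to the answer.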

Sub DsplHtmlBodyFromSelectedEmails()

  ' Select one or more emails then run this macro.  For each selected email,
  ' the Received Time, the Subject and the Html body are output to the
  ' Immediate Window.  Note: the Immediate Window can only display about
  ' 200 lines before the older lines are lost.

  Dim Exp As Explorer
  Dim Html As String
  Dim ItemCrnt As MailItem

  Set Exp = Outlook.Application.ActiveExplorer

  If Exp.Selection.Count = 0 Then
    Call MsgBox("Please select one or more emails then try again", vbOKOnly)
    Exit Sub
  Else
    For Each ItemCrnt In Exp.Selection
      With ItemCrnt
        If .Class = olMail Then
          Debug.Print .ReceivedTime & " " & .Subject
          Html = ""   ' OutLongTextRtn appends, so start afresh for each email
          Call OutLongTextRtn(Html, "Html", .HtmlBody)
          Debug.Print Html
        End If
      End With
    Next
  End If

End Sub
Sub OutLongTextRtn(ByRef TextOut As String, ByVal Head As String, _
                          ByVal TextIn As String)

  ' * Break TextIn into lines of not more than 100 characters
  '   and append to TextOut.
  ' * The output is arranged so:
  '     xxxx|sssssssssssssss|
  '         |sssssssssssssss|
  '         |ssssssssss|
  '   where "xxxx" is the value of Head and "ssss..." are characters from
  '         TextIn.  The third line in the example could be shorter because:
  '           * it contains the last few characters of TextIn
  '           * there is a linefeed in TextIn
  '           * a <xxx> string recording whitespace would have been split
  '             across two lines.

  If TextIn = "" Then
    ' Nothing to do
    Exit Sub
  End If

  Const LenLineMax As Long = 100

  Dim PosBrktEnd As Long     ' Last > before PosEnd
  Dim PosBrktStart As Long   ' Last < before PosEnd
  Dim PosNext As Long        ' Start of block to be output after current block
  Dim PosStart As Long       ' First character of TextIn not yet output

  TextIn = TidyTextForDspl(TextIn)
  TextIn = Replace(TextIn, "lf›", "lf›" & vbLf)

  PosStart = 1
  Do While True
    PosNext = InStr(PosStart, TextIn, vbLf)
    If PosNext = 0 Then
      ' No LF in [Remaining] TextIn
      'Debug.Assert False
      PosNext = Len(TextIn) + 1
    End If
    If PosNext - PosStart > LenLineMax Then
      PosNext = PosStart + LenLineMax
    End If
    ' Check for <xxx> being split across lines
    PosBrktStart = InStrRev(TextIn, "‹", PosNext - 1)
    PosBrktEnd = InStrRev(TextIn, "›", PosNext - 1)
    If PosBrktStart < PosStart And PosBrktEnd < PosStart Then
      ' No <xxx> within text to be displayed
      ' No change to PosNext
      'Debug.Assert False
    ElseIf PosBrktStart > 0 And PosBrktEnd > 0 And PosBrktEnd > PosBrktStart Then
      ' Last or only <xxx> totally within text to be displayed
      ' No change to PosNext
      'Debug.Assert False
    ElseIf PosBrktStart > 0 And _
           (PosBrktEnd = 0 Or (PosBrktEnd > 0 And PosBrktEnd < PosBrktStart)) Then
      ' Last or only <xxx> will be split across rows
      'Debug.Assert False
      PosNext = PosBrktStart
    Else
      ' Are there other combinations?
      Debug.Assert False
    End If

    'Debug.Assert Right$(Mid$(TextIn, PosStart, PosNext - PosStart), 1) <> "‹"

    If TextOut <> "" Then
      TextOut = TextOut & vbLf
    End If
    If PosStart = 1 Then
      TextOut = TextOut & Head & "|"
    Else
      TextOut = TextOut & Space(Len(Head)) & "|"
    End If
    TextOut = TextOut & Mid$(TextIn, PosStart, PosNext - PosStart) & "|"
    PosStart = PosNext
    If Mid$(TextIn, PosStart, 1) = vbLf Then
      PosStart = PosStart + 1
    End If
    If PosStart > Len(TextIn) Then
      Exit Do
    End If
  Loop

End Sub
Function TidyTextForDspl(ByVal Text As String) As String

  ' Tidy Text for display by replacing white space with visible strings:
  '   Leave single space unchanged
  '   Replace single LF by                 ‹lf›
  '   Replace single CR by                 ‹cr›
  '   Replace single TB by                 ‹tb›
  '   Replace single non-break space by    ‹nbs›
  '   Replace single CRLF by               ‹crlf›
  '   Replace multiple spaces by           ‹n s›       where n is number of repeats
  '   Replace multiple LFs by              ‹n lf›      of white space character
  '   Replace multiple CRs by ‹cr› or      ‹n cr›
  '   Replace multiple TBs by              ‹n tb›
  '   Replace multiple non-break spaces by ‹n nbs›
  '   Replace multiple CRLFs by            ‹n crlf›

  Dim InsStr As String
  Dim InxWsChar As Long
  Dim NumWsChar As Long
  Dim PosWsChar As Long
  Dim RetnVal As String
  Dim WsCharCrnt As Variant
  Dim WsCharValue As Variant
  Dim WsCharDspl As Variant

  WsCharValue = VBA.Array(" ", vbCr & vbLf, vbLf, vbCr, vbTab, Chr(160))
  WsCharDspl = VBA.Array("s", "crlf", "lf", "cr", "tb", "nbs")

  RetnVal = Text

  ' Replace each whitespace individually
  For InxWsChar = 0 To UBound(WsCharValue)
    RetnVal = Replace(RetnVal, WsCharValue(InxWsChar), "‹" & WsCharDspl(InxWsChar) & "›")
  Next

  ' Look for repeats. If found replace <x> by <n x>
  For InxWsChar = 0 To UBound(WsCharValue)
    'Debug.Assert InxWsChar <> 1
    PosWsChar = 1
    Do While True
      InsStr = "‹" & WsCharDspl(InxWsChar) & "›"
      PosWsChar = InStr(PosWsChar, RetnVal, InsStr & InsStr)
      If PosWsChar = 0 Then
        ' No [more] repeats of this <x>
        Exit Do
      End If
      ' Have <x><x>.  Count number of extra <x>s
      NumWsChar = 2
      Do While Mid(RetnVal, PosWsChar + NumWsChar * Len(InsStr), Len(InsStr)) = InsStr
        NumWsChar = NumWsChar + 1
      Loop
      RetnVal = Mid(RetnVal, 1, PosWsChar - 1) & _
                "‹" & NumWsChar & " " & WsCharDspl(InxWsChar) & "›" & _
                Mid(RetnVal, PosWsChar + NumWsChar * Len(InsStr))
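      ' Step past the replacement string.  Len(NumWsChar) would give the
      ' storage size of a Long (4 bytes), not its number of digits.
      PosWsChar = PosWsChar + Len(InsStr) + Len(CStr(NumWsChar)) + 1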

    Loop
  Next

  ' Restore any single spaces
  RetnVal = Replace(RetnVal, "‹" & WsCharDspl(0) & "›", " ")

  TidyTextForDspl = RetnVal

End Function

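[Comments]:

  • Thank you! Brilliant answer! Part 1 (the error of repeating "obj.HTMLBody" in the Replace operations): understood. I just had not thought it through properly. Part 2 (vbLf and vbCr): nice! I will try this. However, replacing all CRLFs in an e-mail might mess up important parts of it, so I will (later) have to find a way to affect only the string in question. Your diagnostic tool: this looks great! I will test everything later today as soon as I have time (too many meetings today). For now, this comment is just to tell you that I have read your answer and appreciate it!
  • @ChristianGeiselmann The author of an Html document can include as many TBs, CRs and LFs as they like to make the document more readable for humans. Indenting nested tags is common. A long string of TBs, CRs and LFs becomes a single space when displayed, or is ignored if it is adjacent to a block tag. I do not believe that replacing every CRLF in a document with a space will affect its appearance. However, I have thought of a regular expression solution that you might prefer. I will post it later today.
  • We all make the Replace mistake. You will probably need a few "Duh!" moments before avoiding it becomes second nature. In your question you mention a problem with finding tags but do not explain what you are trying to do. If this is a separate problem, I suggest a new question. There is no limit on the number of questions you can ask, and small questions get answered faster than big ones. In fact, big questions are often ignored, or downvoted and deleted. Do not forget to avoid less-than signs and ampersands in a question about tags. Use character entities instead.
  • Dear Tony, here is a quick interim report: with one improvement to the Replace() calls (making them run as a chain), the procedure in my production environment now does what I want it to do! That is great! (The search string in the live procedure differs slightly from the example above, and it has an additional Replace(NewBody, ...) call that makes a further element disappear, whatever formatting it includes.) However, I will keep working on this, in particular with your analysis tool, and report back here. Since my original problem has been solved, I am marking the question as solved. Cheers.
  • PS: The lesson on "less-than" signs and ampersands: understood, thank you! Good advice! The question of Chr(13) versus vbCr: still to be investigated. In practice, my e-mails now look clean. However, I will analyse the HTML later today to see whether the CRLFs are removed or kept by the procedure I am currently using.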