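[Posted]: 2020-07-03 16:12:44
[Question]:
I am writing a PowerShell script that converts a docx to HTML and changes the HTML's encoding, because by default Word saves it as windows-1252.
I need this because I later use the HTML as the body of an email, which is also sent from PowerShell. Because I am Spanish, I need accents and tildes to display correctly (right now they show up as ?).
I tried the SaveAs method with all of its parameters, but could not get it to work.
This is my script: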
$MSWord = New-Object -ComObject Word.Application
$MSWord.Documents.Open("C:\Users\USER\Videos\CAMBIO_TURNO.docx")
$MSWord.Visible = $false
# Save HTML
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatHTML")
$path = "C:\Users\USER\Videos\CAMBIO_TURNO.html"
$MSWord.ActiveDocument.SaveAs([ref]$path, [ref]$saveFormat)
# Close File
$MSWord.ActiveDocument.Close()
$MSWord.Quit()
Then, to send it to myself, I use this other PowerShell code:
$OutputEncoding = [System.Text.Encoding]::UTF8
$body = [IO.File]::ReadAllText("C:\Users\USER\Videos\CAMBIO_TURNO.html")
Send-MailMessage -To "EMAIL@EMAIL" -From "EMAIL@EMAIL" -Subject "CAMBIO" -Body $body -Encoding $OutputEncoding -BodyAsHtml -Attachments "C:\Users\USER\Videos\CAMBIO_TURNO.xlsx" -Dno onSuccess, onFailure -SmtpServer smtp.gmail.com -Credential EMAIL@EMAIL
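One possible cause of the ? characters is the read step rather than the save step: `[IO.File]::ReadAllText` with a single argument assumes UTF-8 when the file has no byte-order mark, so a windows-1252 file is decoded incorrectly before the mail is ever built. The sketch below illustrates this with a temporary file standing in for the HTML that Word wrote. The sample text and the variable names are assumptions for illustration, not part of the original script.

```powershell
# Hypothetical temp file standing in for the windows-1252 HTML saved by Word.
$cp1252 = [System.Text.Encoding]::GetEncoding(1252)
$tmp    = [System.IO.Path]::GetTempFileName()

# Build the sample with [char] codes so the script file's own encoding does not matter.
$sample = "<p>Espa$([char]0xF1)a, turn$([char]0xF3)</p>"   # contains n-tilde and o-acute
[System.IO.File]::WriteAllText($tmp, $sample, $cp1252)

# No encoding argument: the bytes are interpreted as UTF-8, and the single-byte
# accented letters are invalid sequences that become U+FFFD replacement characters.
$wrong = [System.IO.File]::ReadAllText($tmp)

# Passing the encoding the file was actually written in decodes it correctly.
$right = [System.IO.File]::ReadAllText($tmp, $cp1252)

Remove-Item $tmp
```

Under these assumptions, a string read the second way is ordinary Unicode text, so passing it as `-Body` together with `-Encoding ([System.Text.Encoding]::UTF8)` should let the accented letters be encoded as UTF-8 in the message.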
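Second update
(Although I went to the page this was marked as a duplicate of, Word Document.SaveAs ignores encoding, when calling through OLE, from Ruby or VBS, it did not solve my problem. That Word configuration did not work.)
Here is what I tried after using Web Options to save the document as UTF-8: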
# Define $OutputEncoding for the console. It doesn't seem to work: I typed ñ and ó and they appear as ?? because the hexadecimal values are not converted to the right charset
$OutputEncoding= New-Object -typename System.Text.ASCIIEncoding
# Open word to add input into the signature file
$MSWord = New-Object -ComObject word.application
$MSWord.Documents.Open('C:\Users\USER\Videos\CAMBIO_TURNO.docx')
# Save HTML
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], 'wdFormatFilteredHTML');
$path = 'C:\Users\USER\Videos\CAMBIO_TURNO.html'
$default = [Type]::Missing
$MSWord.ActiveDocument.SaveAs2([ref]$path, [ref]$saveFormat, [ref]$default, [ref]$default, [ref]$default, [ref]$default, [ref]$default, [ref]$default, [ref]$default, [ref]$default, [ref]$default, [ref]28591)
# Close File
$MSWord.ActiveDocument.Close()
$MSWord.Quit()
$HTMLw = Get-Content -Path 'C:\Users\USER\Videos\CAMBIO_TURNO.html' -Encoding ASCII -Force
$HTMLw -replace 'charset=windows-1252','charset=ISO-8859-1' | Set-Content -Path 'C:\Users\USER\Videos\CAMBIO_TURNO.html' -Encoding ASCII -Force
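A note on the last two lines of this attempt: the ASCII encoding has no characters above code point 127, so reading or writing the file with `-Encoding ASCII` replaces every accented letter with ? regardless of what Word wrote. A minimal sketch of an alternative follows. It assumes the file really was written in code page 28591 (the value passed to SaveAs2 above) and it uses a temporary file and hypothetical variable names in place of the real path. It decodes with that encoding, updates the declared charset, and rewrites the file as UTF-8.

```powershell
# Hypothetical stand-in for the HTML file produced by Word (28591 = ISO-8859-1).
$latin1 = [System.Text.Encoding]::GetEncoding(28591)
$file   = [System.IO.Path]::GetTempFileName()
$sample = "<meta charset=windows-1252><p>ma$([char]0xF1)ana</p>"   # contains n-tilde
[System.IO.File]::WriteAllText($file, $sample, $latin1)

# Decoding the same bytes as ASCII shows the loss: the accented byte becomes '?'.
$asAscii = [System.Text.Encoding]::ASCII.GetString([System.IO.File]::ReadAllBytes($file))

# Decode with the encoding that was actually used, then fix the declared charset.
$html = [System.IO.File]::ReadAllText($file, $latin1) -replace 'charset=windows-1252', 'charset=utf-8'

# Write the text back as UTF-8 without a byte-order mark.
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($file, $html, $utf8NoBom)

# Reading it back with the default (UTF-8) decoding now preserves the accent.
$roundTrip = [System.IO.File]::ReadAllText($file)
Remove-Item $file
```

Doing the conversion in .NET rather than in Word is a design choice made here only so the sketch does not depend on how Word honors its encoding settings. Whether Word itself can be made to emit UTF-8 directly is left open.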
[Discussion]:
Tags: powershell ms-word