【Question Title】: How to find a certain phrase in doc/docx files in a folder using PowerShell
【Posted】: 2017-05-22 16:31:01
【Question】:

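Good morning, I am new to PowerShell.

I have been searching for a way to find a certain phrase in the .doc and .docx files in a folder, but I could not find the exact solution I am looking for.

For example, when I execute this code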

Get-ChildItem 'C:\Users\koshiasu\Desktop\adhocs' -Filter *.sql |Select-String -Pattern "children"

it shows the following at the bottom of my PowerShell window

Result 1

Desktop\adhocs\18722 Parents.sql:11:                     AND EXISTS (SELECT c.id_number FROM children c
Desktop\adhocs\18722 Parents.sql:38:                       AND EXISTS (SELECT c.id_number FROM children c
Desktop\adhocs\2969 ADHOC - Parents in Dallas.sql:11:                     AND EXISTS (SELECT c.id_number FROM children c
Desktop\adhocs\2969 ADHOC - Parents in Dallas.sql:92:                     AND EXISTS (SELECT c.id_number FROM children c

I want to do the same thing for .doc and .docx files,

so I changed the code like this

Get-ChildItem 'C:\Users\koshiasu\Desktop\ADHOCS_WORD' -Filter *.doc, *.docx |Select-String -Pattern "Allocations"

but I get this error

Get-ChildItem : Cannot convert 'System.Object[]' to the type 'System.String' required by parameter 'Filter'. Specified method is not supported.
At line:2 char:58
+ Get-ChildItem 'C:\Users\koshiasu\Desktop\ADHOCS_WORD' -Filter <<<<  *.doc, *.docx |Select-String -Pattern "Allocations"
    + CategoryInfo          : InvalidArgument: (:) [Get-ChildItem], ParameterBindingException
    + FullyQualifiedErrorId : CannotConvertArgument,Microsoft.PowerShell.Commands.GetChildItemCommand

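How should I change the code so that it shows output like Result 1?

Thank you very much

【Comments】:

  • Get-ChildItem -Filter accepts only a single filter pattern; use Get-ChildItem |Where-Object {'.doc','.docx' -contains $_.Extension} |Select-String ... instead. It probably won't work, though. A .docx file is basically a zipped folder of XML documents, and I wouldn't try to use regular expressions against them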
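The extension filtering described in this comment can be sketched as follows. This is a minimal illustration, not a full answer: the function name Get-WordFile and its parameters are hypothetical, and piping the resulting files into Select-String still reads the raw file bytes, so it is not expected to match Word text reliably.

```powershell
# Hypothetical helper: list the files in a folder whose extension is in a given set.
# -Filter takes a single string, so the extension check is done with Where-Object instead.
function Get-WordFile {
    param(
        [string]$Path,
        [string[]]$Extensions = @('.doc', '.docx')
    )
    Get-ChildItem -Path $Path |
        Where-Object { -not $_.PSIsContainer -and ($Extensions -contains $_.Extension) }
}
```

With the folder from the question, the call would be Get-WordFile -Path 'C:\Users\koshiasu\Desktop\ADHOCS_WORD'. The -contains comparison is case-insensitive, so .DOCX files are included as well.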

Tags: powershell ms-word


【Solution 1】:

I have some code that does basically the same thing; I tidied it up a bit so you can see what is going on:

$Path = "C:\Test"
$Find = "Allocations"
$WordExts = '.docx','.doc','.docm'

$Word = New-Object -ComObject Word.Application #create word obj
$Word.Visible = $false #hide the window

$ValidDocs = Get-ChildItem $Path | ? {$WordExts -contains $_.Extension} | ForEach { #Foreach doc/docx/docm file in the above folder
    $Doc = $Word.Documents.Open($_.FullName) #Open the document in the word object
    $Content = $Doc.Content #get the 'content' range; it already spans the whole document, so the search starts at the beginning
                              #term,case sensitive,whole word,wildcard,soundslike,all word forms,forward,wrap (1 = wdFindContinue)
    if ($Content.Find.Execute($Find,$false,        $true,     $false,  $false,    $false,  $true,    1)){ #execute a search
        Write-Host "$($_.Name) contains $($Find)" -ForegroundColor Green
        $_.FullName #store this in $ValidDocs
    } else {
        Write-Host "$($_.Name) does not contain $($Find)" -ForegroundColor Red
    }
    $Doc.Close() #close the individual document
    $Doc = $null #null it just in case
}

$Word.Quit() #quit the word process
$Word = $null #null it just in case

return $ValidDocs #return list of docs with the word in them

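【Comments】:

  • Can we get the line number, i.e. the position where the findtext word was found?
  • @user3657339 Not sure, sorry. I believe the Find.Execute function only returns true/false. If you post a new question, I'm sure someone will find an answer.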
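One possible way to get a position, sketched here under assumptions: a Word document has no fixed line numbers, so this uses the paragraph index as a stand-in and walks the document's Paragraphs collection instead of calling Find.Execute. The function name Find-PhraseInParagraph is hypothetical, and the loop can be slow on long documents because every paragraph is read through COM.

```powershell
# Hypothetical helper: report which paragraphs of an open Word document contain a phrase.
function Find-PhraseInParagraph {
    param(
        $Document,          # an open document, e.g. the value returned by $Word.Documents.Open(...)
        [string]$Phrase
    )
    $index = 0
    foreach ($paragraph in $Document.Paragraphs) {
        $index++
        $text = $paragraph.Range.Text
        # Case-insensitive plain-text comparison (no regex, no wildcards)
        if ($text -and $text.IndexOf($Phrase, [System.StringComparison]::OrdinalIgnoreCase) -ge 0) {
            New-Object PSObject -Property @{ Paragraph = $index; Text = $text.Trim() }
        }
    }
}
```

Inside the loop of Solution 1 it could be called as Find-PhraseInParagraph -Document $Doc -Phrase $Find before the document is closed.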
【Solution 2】:

Here is another solution for searching for a phrase in .docx files.

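$source = 'C:\Test'          # folder to search (not set in the original snippet; adjust as needed)
$destination = 'c:\temp\'    # folder that receives result.txt
$word = New-Object -ComObject Word.Application   # the original snippet used $word without creating it
$word.Visible = $false
$docs = Get-ChildItem -Path $source -Recurse | Where-Object {$_.Extension -eq '.docx'}
foreach ($doc in $docs)
{
    $document = $word.Documents.Open($doc.FullName)
    if ($document.Content.Find.Execute('wordtosearchfor'))
    {
        Write-Host "$($doc.FullName) contains 'wordtosearchfor'"
        $doc.FullName | Out-File (Join-Path $destination 'result.txt') -Append   # record only the matching file
    }
    $document.Close()   # close every document, whether or not it matched
}
$word.Quit()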
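For .docx files specifically, a different technique avoids starting Word at all. As the comment under the question notes, a .docx file is a ZIP archive of XML documents, so the text can be read from word/document.xml directly. The sketch below is an illustration under assumptions, not part of either answer: the function names Get-DocxText and Search-DocxFolder are hypothetical, it needs .NET 4.5 or later for ZipFile, it does not handle .doc files, and it only reads the main document body (no headers, footers, or footnotes). XML tags are stripped before searching, so the search runs on plain text rather than on the markup.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: return the body text of a .docx file, one line per paragraph.
function Get-DocxText {
    param([string]$Path)
    $zip = [System.IO.Compression.ZipFile]::OpenRead($Path)
    try {
        $entry = $zip.GetEntry('word/document.xml')
        if ($null -eq $entry) { return '' }
        $reader = New-Object System.IO.StreamReader($entry.Open())
        try { $xml = $reader.ReadToEnd() } finally { $reader.Dispose() }
    } finally {
        $zip.Dispose()
    }
    $xml = $xml -replace '</w:p>', "`n"      # end of paragraph becomes a line break
    $text = $xml -replace '<[^>]+>', ''      # drop all remaining tags
    [System.Net.WebUtility]::HtmlDecode($text)   # turn &amp; and similar entities back into characters
}

# Hypothetical helper: print "file:paragraph: text" for every paragraph containing the phrase.
function Search-DocxFolder {
    param([string]$Path, [string]$Phrase)
    Get-ChildItem -Path $Path -Filter *.docx |
        Where-Object { $_.Name -notlike '~$*' } |    # skip Word's temporary lock files
        ForEach-Object {
            $file = $_
            (Get-DocxText $file.FullName) -split "`n" |
                Select-String -Pattern $Phrase -SimpleMatch |
                ForEach-Object { '{0}:{1}: {2}' -f $file.Name, $_.LineNumber, $_.Line }
        }
}
```

A call such as Search-DocxFolder -Path 'C:\Users\koshiasu\Desktop\ADHOCS_WORD' -Phrase 'Allocations' produces output in the same shape as Result 1 in the question, with the paragraph number in place of the line number.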

【Comments】:
