Check if a word exists in a docx file

【Question title】: Check if word exists in docx file
【Posted】: 2021-02-10 12:09:20
【Question】:

My code loads this docx file:

byte[] documentBytes = File.ReadAllBytes("C:\\mydocument.docx");

The document contains the word "foo" in its body, header, or footer. What is the simplest way to check whether the word "foo" exists?

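【Comments】:

I posted my own answer below, which is why this question is so sparse.
SO strongly encourages self-answered questions, but that does not mean the question may lack relevant information. The rules for posting a question apply as if you had not answered it, so please include all relevant information in the question itself.
I always try to improve my questions. Can you give me a hint on how to do that here? Posting code seems a bit pointless in this case.
You could show what you tried and how it failed. It is a question, remember? So it does not need to contain the answer. If I remember correctly, your question from yesterday already had some of the same issues.
My previous question had no code either. But I will add some extra code to this one to make it look nicer.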

Tags: c# openxml-powertools

【Solution 1】:

Using OpenXML PowerTools:

using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

...

byte[] documentBytes = GetMyBytes(); // Load the docx file with File.ReadAllBytes, generate a byte array, etc.
using var myStream = new MemoryStream(documentBytes, false); // read-only stream over the document bytes
using var myDocument = WordprocessingDocument.Open(myStream, false); // myStream can also be replaced with a path in string format

var regex = new Regex("foo");

// Count the regex matches in the paragraphs of the headers, the footers, and the body
int headerCount = OpenXmlRegex.Match(myDocument.MainDocumentPart.HeaderParts.SelectMany(x => x.GetXDocument().Descendants(W.p)), regex);
int footerCount = OpenXmlRegex.Match(myDocument.MainDocumentPart.FooterParts.SelectMany(x => x.GetXDocument().Descendants(W.p)), regex);
int bodyCount = OpenXmlRegex.Match(myDocument.MainDocumentPart.GetXDocument().Descendants(W.p), regex);

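The variables headerCount, footerCount, and bodyCount hold the number of regex matches in each part of the document. The word exists if any of the three counts is greater than zero. MainDocumentPart also exposes the document's images, charts, themes, and other parts.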
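Since the question asks only whether the word exists, the three counts can be folded into a single yes/no check. The following is a minimal sketch, not part of the original answer: the class name DocxWordSearch and the helpers BuildWordRegex and ContainsWord are hypothetical names introduced for illustration. It assumes that OpenXmlRegex.Match tests the regex against each paragraph's text, so the \b word-boundary anchors should keep "foo" from matching inside "food"; that assumption is worth verifying against your own documents.

```csharp
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

public static class DocxWordSearch
{
    // Hypothetical helper: builds a case-insensitive, whole-word pattern.
    // Regex.Escape keeps characters such as '.' or '+' in the word literal.
    public static Regex BuildWordRegex(string word) =>
        new Regex(@"\b" + Regex.Escape(word) + @"\b", RegexOptions.IgnoreCase);

    // Hypothetical helper: true if the word occurs in the body, a header, or a footer.
    public static bool ContainsWord(byte[] documentBytes, string word)
    {
        var regex = BuildWordRegex(word);

        using var stream = new MemoryStream(documentBytes, false);
        using var document = WordprocessingDocument.Open(stream, false);

        var mainPart = document.MainDocumentPart;
        if (mainPart == null)
        {
            return false; // no main part, so nothing to search
        }

        // Gather the paragraphs from the body, every header, and every footer.
        var paragraphs = mainPart.GetXDocument().Descendants(W.p)
            .Concat(mainPart.HeaderParts.SelectMany(p => p.GetXDocument().Descendants(W.p)))
            .Concat(mainPart.FooterParts.SelectMany(p => p.GetXDocument().Descendants(W.p)));

        // OpenXmlRegex.Match returns the total number of matches found.
        return OpenXmlRegex.Match(paragraphs, regex) > 0;
    }
}
```

A call such as DocxWordSearch.ContainsWord(File.ReadAllBytes("C:\\mydocument.docx"), "foo") would then answer the question directly. If a plain substring match is what you want, drop the \b anchors and the IgnoreCase option to get the same behavior as the regex in the answer.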

【Discussion】:

You may want to update the repository link, since the linked repository is no longer maintained. Use github.com/EricWhiteDev/Open-Xml-PowerTools instead.