【Question Title】:Error using OpenXML to read a .docx file from a memorystream to a WordprocessingDocument to a string and back
【Posted】:2020-02-25 18:53:40
【Question】:

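I have an existing library that I can use to receive a docx file and return it. The software is .Net Core hosted in a Linux Docker container.

The library is very limited in scope, though, and I need to perform some operations that it cannot do. Since these are simple, I thought I would use OpenXML. For my proof of concept, all I need to do is read the docx as a memory stream, replace some text, turn it back into a memory stream, and return it.

However, the docx that gets returned is unreadable. I have commented out the text replacement below to rule it out, and if I comment out the call to the method below, the docx can be read, so I am sure the problem is in this method.

Presumably I am doing something fundamentally wrong here, but after a few hours of googling and playing around with the code I do not know how to correct it. Any ideas what I have wrong?

Thanks for the help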

private MemoryStream SearchAndReplace(MemoryStream mem)
{
    mem.Position = 0;

    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(mem, true))
    {
        string docText = null;

        StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream());
        docText = sr.ReadToEnd();

        //Regex regexText = new Regex("Hello world!");
        //docText = regexText.Replace(docText, "Hi Everyone!");


        MemoryStream newMem = new MemoryStream();
        newMem.Position = 0;
        StreamWriter sw = new StreamWriter(newMem);
        sw.Write(docText);

        return newMem;
    }
}

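【Comments】:

  • You need to set a breakpoint and debug so that you can inspect docText = regexText.Replace(docText, "Hi Everyone!"); with the XML visualizer. Check whether Hello world! sits on the same line of the XML file. If it is on the same line, the search can succeed, but the file will be corrupted. If it is not on the same line, the search cannot find it. For more details you can refer to: stackoverflow.com/a/6010906/11398810
  • Has your problem been solved?

Tags: asp.net-core openxml docx memorystream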


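【Solution 1】:

If your real requirement is to search and replace text in a WordprocessingDocument, you should have a look at this answer.

The following unit test shows how you can make your approach work if the use case really demands that you read a string from a part, "massage" the string, and then write the changed string back to the part. It also shows one of the shortcomings of any approach other than the one described in the answer already mentioned above: it demonstrates that the string "Hello world!" will not be found this way if it is split across w:r elements.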

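// Using directives needed at the top of the file:
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using Xunit;

[Fact]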
public void CanSearchAndReplaceStringInOpenXmlPartAlthoughThisIsNotTheWayToSearchAndReplaceText()
{
    // Arrange.
    using var docxStream = new MemoryStream();
    using (var wordDocument = WordprocessingDocument.Create(docxStream, WordprocessingDocumentType.Document))
    {
        MainDocumentPart part = wordDocument.AddMainDocumentPart();
        var p1 = new Paragraph(
            new Run(
                new Text("Hello world!")));

        var p2 = new Paragraph(
            new Run(
                new Text("Hello ") { Space = SpaceProcessingModeValues.Preserve }),
            new Run(
                new Text("world!")));

        part.Document = new Document(new Body(p1, p2));

        Assert.Equal("Hello world!", p1.InnerText);
        Assert.Equal("Hello world!", p2.InnerText);
    }

    // Act.
    SearchAndReplace(docxStream);

    // Assert.
    using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(docxStream, false))
    {
        MainDocumentPart part = wordDocument.MainDocumentPart;
        Paragraph p1 = part.Document.Descendants<Paragraph>().First();
        Paragraph p2 = part.Document.Descendants<Paragraph>().Last();

        Assert.Equal("Hi Everyone!", p1.InnerText);
        Assert.Equal("Hello world!", p2.InnerText);
    }
}

private static void SearchAndReplace(MemoryStream docxStream)
{
    using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(docxStream, true))
    {
        // If you wanted to read the part's contents as text, this is how you
        // would do it.
        string partText = ReadPartText(wordDocument.MainDocumentPart);

        // Note that this is not the way in which you should search and replace
        // text in Open XML documents. The text might be split across multiple
        // w:r elements, so you would not find the text in that case.
        var regex = new Regex("Hello world!");
        partText = regex.Replace(partText, "Hi Everyone!");

        // If you wanted to write changed text back to the part, this is how
        // you would do it.
        WritePartText(wordDocument.MainDocumentPart, partText);
    }

    docxStream.Seek(0, SeekOrigin.Begin);
}

private static string ReadPartText(OpenXmlPart part)
{
    using Stream partStream = part.GetStream(FileMode.OpenOrCreate, FileAccess.Read);
    using var sr = new StreamReader(partStream);
    return sr.ReadToEnd();
}

private static void WritePartText(OpenXmlPart part, string text)
{
    using Stream partStream = part.GetStream(FileMode.Create, FileAccess.Write);
    using var sw = new StreamWriter(partStream);
    sw.Write(text);
}
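
For readers who want to keep the question's original shape (take a MemoryStream, return a new MemoryStream), the following is a minimal sketch rather than a definitive implementation. It rests on two observations about the question's method: it wrote only the main part's XML into the new stream, so the result was never a complete .docx (zip) package, and it never flushed or disposed the StreamWriter, so buffered text may not have reached the stream at all. The class name DocxTextReplacer is hypothetical and introduced only for illustration; the sketch assumes the DocumentFormat.OpenXml package is referenced. The caveat about text split across w:r elements applies here unchanged.

```csharp
using System.IO;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper class, named for illustration only.
public static class DocxTextReplacer
{
    // Returns a NEW stream that holds the complete, modified .docx package.
    public static MemoryStream SearchAndReplace(MemoryStream source)
    {
        // Copy the whole package first, so the caller's stream stays untouched
        // and the result is a full zip package, not just one part's XML.
        var result = new MemoryStream();
        source.Position = 0;
        source.CopyTo(result);
        result.Position = 0;

        using (WordprocessingDocument doc = WordprocessingDocument.Open(result, true))
        {
            MainDocumentPart part = doc.MainDocumentPart;

            // Read the main part's XML as a string.
            string xml;
            using (var reader = new StreamReader(part.GetStream(FileMode.Open, FileAccess.Read)))
            {
                xml = reader.ReadToEnd();
            }

            // Naive replacement: misses text that is split across w:r elements.
            xml = new Regex("Hello world!").Replace(xml, "Hi Everyone!");

            // FileMode.Create truncates the part; disposing the writer flushes it.
            using (var writer = new StreamWriter(part.GetStream(FileMode.Create, FileAccess.Write)))
            {
                writer.Write(xml);
            }
        }

        // Disposing the document saves the package into `result`;
        // rewind so the caller can read it from the start.
        result.Position = 0;
        return result;
    }
}
```

A caller would use it as `MemoryStream changed = DocxTextReplacer.SearchAndReplace(original);` and is then responsible for disposing the returned stream.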
