【Question title】: How to generate a valid PDF/A file using iText and XMLWorker (HTML to PDF/A process)
【Posted】: 2014-10-25 13:47:29
【Question】:

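I am currently developing a method that takes HTML input and converts it into a valid PDF/A file. I know how to build a valid PDF/A file programmatically with iText (reference: http://itextsupport.com/download/pdfa3.html), but I cannot produce a valid PDF/A file when the input is HTML and XMLWorker does the conversion. The problem comes from the PDF/A requirement that all fonts be embedded. I always get this exception:

Exception in thread "main" com.itextpdf.text.pdf.PdfAConformanceException: All the fonts must be embedded. This one isn't: Helvetica

I tried to force the fonts used by the HTML input through a CSS file, and I registered the fonts for the output PDF through the XMLWorkerFontProvider class, but I seem to be doing something wrong, because the exception above is always thrown.

What else do I need to do so that XMLWorker uses the fonts registered through XMLWorkerFontProvider? I want to avoid the default font, Helvetica, being used for any HTML element in the input.

Here is the code I am testing with:

style.css (a single line):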

* { font: normal 100% Arial, sans-serif !important; }

Main.java:

package com.itextpdf;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.ICC_Profile;
import com.itextpdf.text.pdf.PdfAConformanceLevel;
import com.itextpdf.text.pdf.PdfAWriter;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerFontProvider;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.css.CssFile;
import com.itextpdf.tool.xml.css.StyleAttrCSSResolver;
import com.itextpdf.tool.xml.html.CssAppliers;
import com.itextpdf.tool.xml.html.CssAppliersImpl;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

public class Main {

    /**
     * @param args
     */
    public static void main(String[] args) {

        StringBuffer buf = new StringBuffer();

        buf.append("<!DOCTYPE html>");
        buf.append("<html>");
        buf.append("<head>");
        buf.append("<title>Test</title>");
        buf.append("</head>");
        buf.append("<body>");
        buf.append("<p>This is a test</p>");
        buf.append("</body>");
        buf.append("</html>");

        OutputStream file = null;
        Document document = null;
        PdfAWriter writer = null;

        try {

            file = new FileOutputStream(new File("C:\\Users\\amartin\\Desktop\\Test.pdf"));
            document = new Document();
            writer = PdfAWriter.getInstance(document, file, PdfAConformanceLevel.PDF_A_1B);

            // Create XMP metadata. It's a PDF/A requirement.
            writer.createXmpMetadata();

            document.open();

            // Set output intent. PDF/A requirement.
            ICC_Profile icc = ICC_Profile.getInstance(new FileInputStream("./src/main/resources/com/itextpdf/sRGB Color Space Profile.icm"));
            writer.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);

            // CSS
            CSSResolver cssResolver = new StyleAttrCSSResolver();
            CssFile cssFile = XMLWorkerHelper.getCSS(new FileInputStream("./css/style.css"));
            cssResolver.addCss(cssFile);

            XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider();
            fontProvider.register("./fonts/arial.ttf");
            fontProvider.register("./fonts/sans-serif.ttf");
            fontProvider.addFontSubstitute("lowagie", "garamond");

            CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
            HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
            htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());

            // Pipelines
            PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
            HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
            CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

            XMLWorker worker = new XMLWorker(css, true);
            XMLParser p = new XMLParser(worker);

            Reader reader = new StringReader(buf.toString());
            p.parse(reader);

        } catch (Exception e) {

            e.printStackTrace();

        } finally {

            if (document != null && document.isOpen())
                document.close();

            try {

                if (file != null)
                    file.close();

            } catch (IOException e) {}

            if (writer != null && !writer.isCloseStream())
                writer.close();

        }

    }

}

Edit:

Following Bruno's comment, I extended the FontFactoryImp class and overrode the getFont() method (the overload that takes all the parameters). It calls System.out.println like this:

System.out.println("=fontname: " + fontname + " =encoding: " + encoding + " =embedded : " + embedded + " =size: " + size + " =style: " + style + " =BaseColor: " + color);

It then calls super.getFont() with the same parameters. The only output I see is:

=fontname: null =encoding: Cp1252 =embedded : true =size: -1.0 =style: -1 =BaseColor: null
=fontname: null =encoding: Cp1252 =embedded : true =size: -1.0 =style: -1 =BaseColor: null

followed by the exception quoted above.

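【Comments on the question】:

  • Create your own FontProvider implementation and write the requested fonts to System.out. Make sure every one of those fonts maps to a font program.
  • Hi Bruno. I did what you suggested and have posted the results as an edit to the question. I would appreciate any feedback or suggestions.
  • The output means the font you defined in the CSS is not being picked up by XML Worker. Now that I have looked at your CSS, I can see why. Try font-family: "Arial" instead of font.
  • Works like a charm now, Bruno. Thank you very much for your help! Would you like to post this as an answer so I can mark it as correct and close the question? Again, thank you very much :-)
  • Could you take a look at this question: stackoverflow.com/q/52736441/3169868

Tags: pdf fonts itext pdfa xmlworker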


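【Answer 1】:

Judging from the output you sent to System.out, XML Worker is not picking up the font family you want to use.

Specify the font family like this:

font-family: "Arial"

Using the font shorthand in CSS may work, but it is tricky. I suspect iText sees normal and interprets it as "use the default font".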
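
A stylesheet can be checked for the font shorthand before it is handed to XML Worker. The following is only a minimal sketch under stated assumptions: FontShorthandCheck and findFontShorthands are hypothetical names and not part of iText, the check uses nothing but java.util.regex, and it does not understand CSS comments or strings. It reports each font shorthand value so it can be rewritten by hand as font-family (plus font-size, font-style, and so on).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FontShorthandCheck {

    // Matches the "font" shorthand property, but not longhands such as
    // font-family or font-size, and not names that merely end in "font".
    private static final Pattern FONT_SHORTHAND =
            Pattern.compile("(?<![\\w-])font\\s*:\\s*([^;}]*)", Pattern.CASE_INSENSITIVE);

    /** Returns the value of every "font:" shorthand declaration found in the CSS text. */
    static List<String> findFontShorthands(String css) {
        List<String> values = new ArrayList<>();
        Matcher m = FONT_SHORTHAND.matcher(css);
        while (m.find()) {
            values.add(m.group(1).trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // The single rule from the question's style.css.
        String css = "* { font: normal 100% Arial, sans-serif !important; }";
        for (String value : findFontShorthands(css)) {
            // prints "font shorthand found: normal 100% Arial, sans-serif !important"
            System.out.println("font shorthand found: " + value);
        }
    }
}
```

Run against a stylesheet that only uses font-family and font-style, the method returns an empty list.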

【Comments】:

    【Answer 2】:

    The complete code that makes the example work is as follows:

    style.css:

    * {
        font-family: "Arial";
        font-style: normal;
    }
    

    Main.java:

    package com.itextpdf;
    
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.io.Reader;
    import java.io.StringReader;
    
    import com.itextpdf.text.Document;
    import com.itextpdf.text.pdf.ICC_Profile;
    import com.itextpdf.text.pdf.PdfAConformanceLevel;
    import com.itextpdf.text.pdf.PdfAWriter;
    import com.itextpdf.tool.xml.XMLWorker;
    import com.itextpdf.tool.xml.XMLWorkerHelper;
    import com.itextpdf.tool.xml.css.CssFile;
    import com.itextpdf.tool.xml.css.StyleAttrCSSResolver;
    import com.itextpdf.tool.xml.html.CssAppliers;
    import com.itextpdf.tool.xml.html.CssAppliersImpl;
    import com.itextpdf.tool.xml.html.Tags;
    import com.itextpdf.tool.xml.parser.XMLParser;
    import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
    import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
    import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
    import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
    import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;
    
    public class Main {
    
        public static void main(String[] args) {
    
            StringBuffer buf = new StringBuffer();
    
            String title = "Test";
    
            // Sample HTML content.
            buf.append("<!DOCTYPE html>");
            buf.append("<html>");
            buf.append("<head>");
            buf.append("<title>" + title + "</title>");
            buf.append("</head>");
            buf.append("<body>");
            buf.append("<p>This is a test</p>");
            buf.append("</body>");
            buf.append("</html>");
    
            OutputStream file = null;
            Document document = null;
            PdfAWriter writer = null;
    
            try {
    
                file = new FileOutputStream(new File("C:\\Users\\amartin\\Desktop\\Test.pdf"));
                document = new Document();
                writer = PdfAWriter.getInstance(document, file, PdfAConformanceLevel.PDF_A_1B);
    
                // Avoid discrepancies between the document title and the XMP metadata.
                document.addTitle(title);
    
                // Create XMP metadata. It's a PDF/A requirement.
                writer.createXmpMetadata();
    
                document.open();
    
                // Set output intent. PDF/A requirement.
                ICC_Profile icc = ICC_Profile.getInstance(new FileInputStream("./src/main/resources/com/itextpdf/sRGB Color Space Profile.icm"));
                writer.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
    
                // CSS stylesheet.
                CSSResolver cssResolver = new StyleAttrCSSResolver();
                CssFile cssFile = XMLWorkerHelper.getCSS(new FileInputStream("./css/style.css"));
                cssResolver.addCss(cssFile);
    
                MyFontProvider fontProvider = new MyFontProvider();
                fontProvider.register("./fonts/arial.ttf");
    
                /* DEBUG (uncommenting this block also requires: import java.util.Set;)
                System.out.println("Fonts present in " + fontProvider.getClass().getName());
                Set<String> registeredFonts = fontProvider.getRegisteredFonts();
                for (String font : registeredFonts)
                    System.out.println(font);
                */
    
                CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
                HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
                htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
    
                // Pipelines.
                PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
                HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
                CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
    
                XMLWorker worker = new XMLWorker(css, true);
                XMLParser p = new XMLParser(worker);
    
                Reader reader = new StringReader(buf.toString());
                p.parse(reader);
    
            } catch (Exception e) {
    
                e.printStackTrace();
    
            } finally {
    
                if (document != null && document.isOpen())
                    document.close();
    
                try {
    
                    if (file != null)
                        file.close();
    
                } catch (IOException e) {}
    
                if (writer != null && !writer.isCloseStream())
                    writer.close();
    
            }
    
        }
    
    }
    

    MyFontProvider.java:

    package com.itextpdf;
    
    import com.itextpdf.text.BaseColor;
    import com.itextpdf.text.Font;
    import com.itextpdf.text.FontFactoryImp;
    
    public class MyFontProvider extends FontFactoryImp {
    
        @Override
        public Font getFont(String fontname, String encoding, boolean embedded,
                float size, int style, BaseColor color) {
    
            System.out.println("=fontname: " + fontname + " =encoding: " + encoding + " =embedded : " + embedded + " =size: " + size + " =style: " + style + " =BaseColor: " + color);
    
            return super.getFont(fontname, encoding, embedded, size, style, color);
    
        }
    
    }
    

    Thanks again, Bruno. It is great to have your help here :)

    【Comments】:

    • I used your code, but I get the error All the fonts must be embedded. This one isn't: Helvetica. How can I solve this on a Windows machine? I am using fontProvider.register("C:\\Windows\\Fonts\\Arial.ttf");
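
One thing worth ruling out with a path like the one in the comment above is whether it resolves to a readable file at run time. The sketch below is an assumption-laden pre-check, not an explanation of the Helvetica fallback: FontFileCheck and unreadableFontFiles are hypothetical helper names, the code uses only java.nio.file, and it does not call iText at all.

```java
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class FontFileCheck {

    /** Returns those paths that do not point at a readable regular file. */
    static List<String> unreadableFontFiles(String... fontPaths) {
        List<String> unreadable = new ArrayList<>();
        for (String fontPath : fontPaths) {
            try {
                Path path = Paths.get(fontPath);
                if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
                    unreadable.add(fontPath);
                }
            } catch (InvalidPathException e) {
                // A string that is not a valid path on this platform cannot be a font file.
                unreadable.add(fontPath);
            }
        }
        return unreadable;
    }

    public static void main(String[] args) {
        // Paths taken from the thread; adjust them to the machine being tested.
        List<String> unreadable =
                unreadableFontFiles("./fonts/arial.ttf", "C:\\Windows\\Fonts\\Arial.ttf");
        for (String fontPath : unreadable) {
            System.out.println("Not a readable font file: " + fontPath);
        }
    }
}
```

If every path passes this check and the exception persists, the font files themselves are probably not the cause, and the CSS font declarations are the next thing to inspect.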