【Question title】: How can I remove blank pages from a PDF in iText?
【Posted】: 2011-01-28 16:20:54
【Question】:

I want to remove blank pages from a PDF generated with the iText library in Java.

How can I do that?

【Comments】:

    Tags: itext


    【Solution 1】:

    C# (as requested by kalyan)

    // Requires: using System; using System.IO;
    //           using iTextSharp.text; using iTextSharp.text.pdf;
    public static void removeBlankPdfPages(string pdfSourceFile, string pdfDestinationFile, bool debug)
    {
        // step 0: pages whose content is at most this many bytes are treated as blank
        const int blankPdfsize = 20;

        // step 1: create a new reader
        var r = new PdfReader(pdfSourceFile);
        var raf = new RandomAccessFileOrArray(pdfSourceFile);
        var document = new Document(r.GetPageSizeWithRotation(1));

        // step 2: create a writer that listens to the document
        var writer = new PdfCopy(document, new FileStream(pdfDestinationFile, FileMode.Create));

        // step 3: open the document
        document.Open();

        // step 4: add content
        PdfImportedPage page = null;

        // Loop through each page. If bs is longer than 20 bytes, the page is not blank;
        // otherwise it is treated as blank and left out of the new PDF.
        for (var i = 1; i <= r.NumberOfPages; i++)
        {
            // get the page content
            byte[] bContent = r.GetPageContent(i, raf);
            var bs = new MemoryStream();

            // write the content to an output stream
            bs.Write(bContent, 0, bContent.Length);
            if (debug)
            {
                Console.WriteLine("page content length of page {0} = {1}", i, bs.Length);
            }

            // add the page to the new PDF
            if (bs.Length > blankPdfsize)
            {
                page = writer.GetImportedPage(r, i);
                writer.AddPage(page);
            }
            bs.Close();
        }

        // close everything
        document.Close();
        writer.Close();
        raf.Close();
        r.Close();
    }
    

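    【Comments】:

    • What is the point of writing the byte array to a memory stream just to get the memory stream's length?
    • Sorry @Simon, I don't remember! ... This is a 3-year-old question.
    • Well, I compared directly against bContent.Length and it worked. Not sure whether you want to bother updating the answer :)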
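    【Solution 2】:

    I'm sure there are several ways to do this, but here is an example of how I did it. I just check the amount of data on each page, and if it is 20 bytes or less, I don't include the page: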

    // Imports assume iText 5 package names (iText 2.x uses com.lowagie.text instead):
    // import java.io.ByteArrayOutputStream;
    // import java.io.FileOutputStream;
    // import com.itextpdf.text.Document;
    // import com.itextpdf.text.pdf.*;
    public void removeBlankPdfPages(String pdfSourceFile, String pdfDestinationFile, boolean debug)
        {
            // pages whose content is at most this many bytes are treated as blank
            final int blankPdfsize = 20;
            try
            {
                // step 1: create a new reader
                PdfReader r = new PdfReader(pdfSourceFile);
                RandomAccessFileOrArray raf = new RandomAccessFileOrArray(pdfSourceFile);
                Document document = new Document(r.getPageSizeWithRotation(1));
                // step 2: create a writer that listens to the document
                PdfCopy writer = new PdfCopy(document, new FileOutputStream(pdfDestinationFile));
                // step 3: open the document
                document.open();
                // step 4: add content
                PdfImportedPage page = null;

                // Loop through each page. If bs is larger than 20 bytes, the page is not blank;
                // otherwise it is treated as blank and left out of the new PDF.
                for (int i = 1; i <= r.getNumberOfPages(); i++)
                {
                    // get the page content
                    byte[] bContent = r.getPageContent(i, raf);
                    ByteArrayOutputStream bs = new ByteArrayOutputStream();
                    // write the content to an output stream
                    bs.write(bContent);
                    // logger is assumed to be a logging field on the enclosing class
                    logger.debug("page content length of page " + i + " = " + bs.size());
                    // add the page to the new PDF
                    if (bs.size() > blankPdfsize)
                    {
                        page = writer.getImportedPage(r, i);
                        writer.addPage(page);
                    }
                    bs.close();
                }
                // close everything
                document.close();
                writer.close();
                raf.close();
                r.close();
            }
            catch (Exception e)
            {
                // do what you need here (log or rethrow rather than ignoring the error)
            }
        }
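
The selection step in the loop above can be separated from iText and checked on its own. The following is only a minimal sketch, assuming the per-page content bytes have already been read (for example with `r.getPageContent(i, raf)` as in the answer). The class name `BlankPageFilter` and the method `pagesToKeep` are hypothetical, and the sketch compares the array length directly instead of copying the bytes into a stream first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iText: decides which pages to copy.
public class BlankPageFilter {

    /**
     * Returns the 1-based numbers of the pages whose content is longer
     * than thresholdBytes. pageContents.get(0) holds the bytes of page 1.
     */
    public static List<Integer> pagesToKeep(List<byte[]> pageContents, int thresholdBytes) {
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < pageContents.size(); i++) {
            byte[] content = pageContents.get(i);
            // A missing content array is treated the same as an empty one.
            int length = (content == null) ? 0 : content.length;
            if (length > thresholdBytes) {
                keep.add(i + 1); // PDF page numbers start at 1
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        List<byte[]> pages = new ArrayList<>();
        pages.add(new byte[150]); // page 1: has content
        pages.add(new byte[8]);   // page 2: below the threshold
        pages.add(new byte[21]);  // page 3: just above the threshold
        System.out.println(pagesToKeep(pages, 20)); // prints [1, 3]
    }
}
```

The returned page numbers could then be passed to `writer.getImportedPage(r, n)` and `writer.addPage(page)` one by one. It is probably worth checking that the list is not empty before opening the output document, since a PDF with no pages is not a useful result.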
    

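    【Comments】:

    • I think the actual content of the PDF page should be checked. In my case the blank pages were larger than 20 bytes, so they passed the check. I suggest using this line of code: string extractText = PdfTextExtractor.GetTextFromPage(pdfreader , pageNum, new LocationTextExtractionStrategy()); and then checking whether the result is null or empty.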
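
The text-based check suggested in this comment comes down to a null-or-whitespace test on the extracted string. Here is a minimal sketch in Java, assuming the text has already been extracted by some call equivalent to the `PdfTextExtractor.GetTextFromPage` line quoted above. `TextBlankCheck` and `isBlankText` are hypothetical names. One caveat: a page that holds only images or vector graphics yields no text, so this test alone would classify it as blank.

```java
// Hypothetical helper, not part of iText: classifies extracted page text.
public class TextBlankCheck {

    /** Returns true when the extracted text is null, empty, or whitespace only. */
    public static boolean isBlankText(String extractedText) {
        return extractedText == null || extractedText.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBlankText(null));         // prints true
        System.out.println(isBlankText(" \n\t "));     // prints true
        System.out.println(isBlankText("Invoice 42")); // prints false
    }
}
```

Combining this with the byte-length threshold (skip a page only when both checks say it is blank) may be a safer rule, though that combination is a suggestion and not something the answers above tested.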