【Question title】: Creating image (PNG or JPEG) from PDF along with HTML image maps of text in the image?
【Posted】: 2019-01-25 07:09:07
【Question】:

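I'm documenting a system I maintain. The documentation includes a diagram that I created in TeX/TikZ and that is rendered to a PDF file. I then convert the PDF file to an image file (a PNG, via imagemagick) and include it in my HTML documentation. This works fine.

Now I'd like to create an image map for the image so that I can add hyperlinks, mouseover effects, and so on. This is an image I expect to update regularly as my system changes, so I'd like to automate the process if possible.

Is there a software library or tool that can automatically create an image map for the various pieces of text content in the PDF file when it is rendered to PNG?

Here's an example from this gist that I created:

In this case, I'd like to turn a number of different text strings into hyperlinks by locating their bounding boxes in the PDF: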

  • controller
  • actuator
  • sensor
  • A
  • B
  • C
  • D
  • u
  • y
  • F(s)
  • G(s)
  • H(s)

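(These are all text content in the PDF file; I can select any of them in Acrobat Reader, then copy and paste it into my text editor.)

Is there a way to do this?

【Comments】:

  • I think you'll have to add an extra 250 (or even more) bounty for this solution. This person worked for you for a long time!
  • Another approach to consider is converting the TeX file to SVG. SVG supports clickable href.

Tags: html pdf imagemap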


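【Solution 1】:

I was able to put together the following Python solution as a starting point. It converts the PDF to a PNG and outputs the corresponding image map markup.

It takes the output dpi as an optional argument (default 200) so that the bounding boxes are scaled correctly from the PDF's default 72 dpi to the PNG: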

from pdf2image import convert_from_path
from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTTextBox
from pdfminer.pdfinterp import PDFPageInterpreter
from pdfminer.pdfinterp import PDFResourceManager
from pdfminer.pdfpage import PDFPage

from yattag import Doc, indent

import argparse
import os


def transform_coords(lobj, mb):

    # Transform LTTextBox bounding box to image map area bounding box.
    #
    # The bounding box of each LTTextBox is specified as:
    #
    # x0: the distance from the left of the page to the left edge of the box
    # y0: the distance from the bottom of the page to the lower edge of the box
    # x1: the distance from the left of the page to the right edge of the box
    # y1: the distance from the bottom of the page to the upper edge of the box
    #
    # So the y coordinates start from the bottom of the image. But with image map
    # areas, y coordinates start from the top of the image, so here we subtract
    # the bounding box's y-axis values from the total height.

    return [lobj.x0, mb[3] - lobj.y1, lobj.x1, mb[3] - lobj.y0]


def get_imagemap(d):
    doc, tag, text = Doc().tagtext()
    with tag("map", name="map"):
        for k, v in d.items():
            doc.stag("area", shape="rect", coords=",".join(v), href="#", alt=k)
    return indent(doc.getvalue())


def get_bboxes(pdf, dpi):
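    rsrcmgr = PDFResourceManager()
    device = PDFPageAggregator(rsrcmgr, laparams=LAParams())
    interpreter = PDFPageInterpreter(rsrcmgr, device)

    # Only the first page is processed. The file stays open while pdfminer
    # reads it and is closed as soon as the layout has been extracted.
    with open(pdf, "rb") as fp:
        page = list(PDFPage.get_pages(fp))[0]
        interpreter.process_page(page)
        layout = device.get_result()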

    # PDFminer reports bounding boxes based on a dpi of 72. I could not find a way
    # to change this, so instead I scale each coordinate by multiplying by dpi/72
    scale = dpi / 72.0

    return {
        lobj.get_text().strip(): [
            str(int(x * scale)) for x in transform_coords(lobj, page.mediabox)
        ]
        for lobj in layout
        if isinstance(lobj, LTTextBox)
    }


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("pdf")
    parser.add_argument("--dpi", type=int, default=200)

    args = parser.parse_args()

    page = list(convert_from_path(args.pdf, args.dpi))[0]
    page.save(f"{os.path.splitext(args.pdf)[0]}.png", "PNG")

    print(get_imagemap(get_bboxes(args.pdf, args.dpi)))


if __name__ == "__main__":
    main()
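
To make the coordinate handling concrete, here is a minimal standalone sketch of the same two steps (flip the y axis, then scale from 72 units per inch to the output dpi). The function name `pdf_bbox_to_pixels` and the numbers are made up for illustration and are not part of the script above:

```python
def pdf_bbox_to_pixels(bbox, page_height_pt, dpi):
    """Convert a PDF-space box to image-map pixel coordinates.

    bbox is (x0, y0, x1, y1) in points with the origin at the bottom-left
    of the page; the result is [left, top, right, bottom] in pixels with
    the origin at the top-left of the image.
    """
    x0, y0, x1, y1 = bbox
    # Flip the y axis: the upper edge (y1) becomes the smaller pixel value.
    flipped = (x0, page_height_pt - y1, x1, page_height_pt - y0)
    # Scale from 72 units per inch to the requested dpi, truncating as the
    # script above does with int().
    return [int(v * dpi / 72) for v in flipped]


# Hypothetical box on a hypothetical 144 pt tall page, rendered at 200 dpi.
print(pdf_bbox_to_pixels((36.0, 72.0, 108.0, 90.0), page_height_pt=144.0, dpi=200))
# prints [100, 150, 300, 200]
```

Like the script, this assumes the page's media box starts at (0, 0), so that its fourth entry is the page height.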

Example result of running the script on the diagram from the question:

<img src="https://i.stack.imgur.com/aXWMc.png" usemap="#map">
<map name="map">
  <area shape="rect" coords="361,8,380,43" href="#" alt="B" />
  <area shape="rect" coords="434,31,500,64" href="#" alt="G(s)" />
  <area shape="rect" coords="432,93,502,117" href="#" alt="actuator" />
  <area shape="rect" coords="552,8,572,42" href="#" alt="C" />
  <area shape="rect" coords="596,58,609,86" href="#" alt="y" />
  <area shape="rect" coords="105,26,119,40" href="#" alt="+" />
  <area shape="rect" coords="107,54,122,78" href="#" alt="−" />
  <area shape="rect" coords="35,58,51,86" href="#" alt="u" />
  <area shape="rect" coords="164,8,182,43" href="#" alt="A" />
  <area shape="rect" coords="163,152,183,187" href="#" alt="D" />
  <area shape="rect" coords="241,31,311,64" href="#" alt="H(s)" />
  <area shape="rect" coords="236,94,316,118" href="#" alt="controller" />
  <area shape="rect" coords="243,175,309,208" href="#" alt="F (s)" />
  <area shape="rect" coords="247,234,305,258" href="#" alt="sensor" />
</map>
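
The markup above still has placeholder links, and it includes entries such as "+" that probably should not be clickable. One possible follow-up step, sketched here with only the standard library, is to keep just the labels you have a link for and fill in the `href`. The function `build_imagemap`, the `links` table and the URL in it are hypothetical; the sketch also assumes that labels are single tokens, so whitespace can simply be removed to match "F (s)" against "F(s)":

```python
from html import escape


def build_imagemap(bboxes, links, name="map"):
    """Build <map> markup for the labels that have an entry in `links`.

    bboxes: {label: [left, top, right, bottom]} as produced by get_bboxes()
    links:  {label: url}, keyed by the label with all whitespace removed
    """
    lines = [f'<map name="{escape(name)}">']
    for label, coords in bboxes.items():
        key = "".join(label.split())  # "F (s)" -> "F(s)"
        if key not in links:
            continue  # skip text that should not be clickable, e.g. "+"
        coord_str = ",".join(str(c) for c in coords)
        lines.append(
            f'  <area shape="rect" coords="{coord_str}" '
            f'href="{escape(links[key])}" alt="{escape(key)}" />'
        )
    lines.append("</map>")
    return "\n".join(lines)


# Two boxes taken from the example result; the link target is made up.
bboxes = {"F (s)": ["243", "175", "309", "208"], "+": ["105", "26", "119", "40"]}
links = {"F(s)": "docs/feedback.html"}
print(build_imagemap(bboxes, links))
```

Removing all whitespace would merge the words of a multi-word label, so a different normalization would be needed for diagrams that contain such labels.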

【Comments】:

  • Wow! Trying to digest this... (I know Python.) Can you clarify what transform_coords is doing?
  • @JasonS Sure, I've added some comments to the code in the answer
  • @JasonS Wow! Thanks so much!


【Solution 2】:

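Hmm. I found the Apache PDFBox library, which includes an example called PrintTextLocations.java that prints this information, but I don't know how to interpret it, and it gives only one position per glyph.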

> java -jar print_text_locations.jar blockdiagram_example.pdf
String[37.864998,13.939003 fs=4.9813 xscale=4.9813 height=2.49065 space=2.4906502 width=5.1197815]+
String[59.185997,13.662003 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=6.6450577]A
String[130.229,13.662003 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=6.64505]B
String[198.783,13.498001 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=7.192993]C
String[86.827,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.699257]H
String[97.449005,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[102.00201,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s
String[107.51601,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])
String[156.35,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.234192]G
String[165.58301,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[170.136,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.513733]s
String[175.65,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])
String[12.797,29.332 fs=9.9626 xscale=9.9626 height=4.9813 space=4.9813004 width=5.7035875]u
String[38.711,27.432999 fs=4.9813 xscale=4.9813 height=3.4022279 space=2.4906502 width=5.39624]?
String[214.641,29.332 fs=9.9626 xscale=9.9626 height=4.9813 space=4.9813004 width=4.884659]y
String[85.109,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]c
String[88.5959,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[92.473335,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]n
String[96.35077,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387131]t
String[98.28948,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
String[100.611755,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[104.48919,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.5481873]l
String[106.03738,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.5481873]l
String[107.58556,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]e
String[111.463,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
String[155.67801,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[159.55544,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4868927]c
String[163.04233,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[164.98105,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]u
String[168.85847,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[172.7359,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[174.67462,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]o
String[178.55205,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.322281]r
String[58.912003,65.483 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=7.192993]D
String[87.536,73.099 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=7.577202]F
String[96.740005,73.099 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[101.29201,73.099 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s
String[106.80601,73.099 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.5525436])
String[88.983,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]s
String[92.4699,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]e
String[96.347336,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]n
String[100.22477,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]s
String[103.71167,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[107.5891,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r

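I did make one small change, though: it looks like the writeString method is called once for each text item, so I think I can find the overall bounding rectangle of each string: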

/**
 * Override the default functionality of PDFTextStripper.
 */
@Override
protected void writeString(String string, List<TextPosition> textPositions) throws IOException
{
    System.out.println("text string: "+string);
    for (TextPosition text : textPositions)
    {
        System.out.println( "String[" + text.getXDirAdj() + "," +
                text.getYDirAdj() + " fs=" + text.getFontSize() + " xscale=" +
                text.getXScale() + " height=" + text.getHeightDir() + " space=" +
                text.getWidthOfSpace() + " width=" +
                text.getWidthDirAdj() + "]" + text.getUnicode() );
    }
}

Output for the PDF file in the GitHub gist:

> java -jar pdf2imagemap.jar blockdiagram_example.pdf
text string: +
String[37.864998,13.939003 fs=4.9813 xscale=4.9813 height=2.49065 space=2.4906502 width=5.1197815]+
text string: A
String[59.185997,13.662003 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=6.6450577]A
text string: B
String[130.229,13.662003 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=6.64505]B
text string: C
String[198.783,13.498001 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=7.192993]C
text string: H(s)
String[86.827,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.699257]H
String[97.449005,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[102.00201,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s
String[107.51601,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])
text string: G(s)
String[156.35,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.234192]G
String[165.58301,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[170.136,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.513733]s
String[175.65,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])
text string: u
String[12.797,29.332 fs=9.9626 xscale=9.9626 height=4.9813 space=4.9813004 width=5.7035875]u
text string: ?
String[38.711,27.432999 fs=4.9813 xscale=4.9813 height=3.4022279 space=2.4906502 width=5.39624]?
text string: y
String[214.641,29.332 fs=9.9626 xscale=9.9626 height=4.9813 space=4.9813004 width=4.884659]y
text string: controller
String[85.109,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]c
String[88.5959,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[92.473335,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]n
String[96.35077,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387131]t
String[98.28948,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
String[100.611755,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[104.48919,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.5481873]l
String[106.03738,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.5481873]l
String[107.58556,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]e
String[111.463,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
text string: actuator
String[155.67801,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[159.55544,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4868927]c
String[163.04233,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[164.98105,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]u
String[168.85847,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[172.7359,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[174.67462,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]o
String[178.55205,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.322281]r
text string: D
String[58.912003,65.483 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=7.192993]D
text string: F
String[87.536,73.099 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=7.577202]F
text string: (s)
String[96.740005,73.099 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[101.29201,73.099 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s
String[106.80601,73.099 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.5525436])
text string: sensor
String[88.983,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]s
String[92.4699,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]e
String[96.347336,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]n
String[100.22477,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]s
String[103.71167,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[107.5891,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
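
One way to get from this output to a rectangle per string is to take the union of the glyph boxes that follow each "text string:" line. The Python sketch below parses the printed text rather than calling PDFBox, and it rests on an assumption about the fields: that the first two numbers are the glyph's left edge and its baseline measured from the top of the page, so that each glyph spans from y - height up to y. The names `GLYPH`, `PREFIX` and `string_bboxes` are made up for this example:

```python
import re

PREFIX = "text string: "
GLYPH = re.compile(
    r"String\[(?P<x>[\d.]+),(?P<y>[\d.]+) fs=[\d.]+ xscale=[\d.]+ "
    r"height=(?P<h>[\d.]+) space=[\d.]+ width=(?P<w>[\d.]+)\]"
)


def string_bboxes(output, dpi=72):
    """Return [(text, [left, top, right, bottom]), ...] in pixels at `dpi`."""
    groups = []  # one (text, [glyph rectangles]) entry per "text string:" line
    for line in output.splitlines():
        if line.startswith(PREFIX):
            groups.append((line[len(PREFIX):], []))
            continue
        m = GLYPH.match(line)
        if m and groups:
            x, y, h, w = (float(m[k]) for k in ("x", "y", "h", "w"))
            # Assumption: y is the baseline measured from the top of the page.
            groups[-1][1].append((x, y - h, x + w, y))
    result = []
    for text, rects in groups:
        if not rects:
            continue
        union = (
            min(r[0] for r in rects),
            min(r[1] for r in rects),
            max(r[2] for r in rects),
            max(r[3] for r in rects),
        )
        result.append((text, [round(v * dpi / 72) for v in union]))
    return result


sample = """text string: H(s)
String[86.827,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.699257]H
String[97.449005,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[102.00201,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s
String[107.51601,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])"""
print(string_bboxes(sample, dpi=200))
```

For this sample the left and right edges come out as 241 and 311, which agrees with the H(s) area in Solution 1. The vertical extent is tighter because it is derived from the reported glyph height, so some padding may be needed. Note also that "F" and "(s)" are reported as separate strings in the output above, so they would have to be merged in a further step.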

【Comments】:
