Why are the bounding boxes from Tesseract not aligned on the image text?

【Question title】: Why are the Bounding Boxes from Tesseract not aligned on the image text?
【Posted】: 2021-12-03 14:00:08
【Question】:

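I am using the tesseract R package to recognize text in an image file. However, when I plot the bounding box of a word, the coordinates appear to be wrong.

Why does the bounding box for the word "This" not line up with the text "This" in the image?
Is there a simpler way to draw all of the bounding-box rectangles on the image?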

library(tesseract)
library(magick)
library(tidyverse)

text <- tesseract::ocr_data("http://jeroen.github.io/images/testocr.png")
image <- image_read("http://jeroen.github.io/images/testocr.png")

text <- text %>% 
  separate(bbox, c("x1", "y1", "x2", "y2"), ",") %>% 
  mutate(
    x1 = as.numeric(x1),
    y1 = as.numeric(y1),
    x2 = as.numeric(x2),
    y2 = as.numeric(y2)
  )

plot(image)
rect(
  xleft = text$x1[1], 
  ybottom = text$y1[1], 
  xright = text$x2[1], 
  ytop = text$y2[1])

【Comments】:

Tags: r ocr tesseract bounding-box

【Solution 1】:

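This happens simply because the image's x, y coordinates are measured from the top-left corner, whereas rect() measures from the bottom-left corner. The y values therefore have to be subtracted from the image height. The image is 480 pixels high, so we can do: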

plot(image)
# Flip y: image rows count down from the top, plot coordinates count up from the bottom
rect(
  xleft = text$x1[1], 
  ybottom = 480 - text$y1[1], 
  xright = text$x2[1], 
  ytop = 480 - text$y2[1])

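Or, to illustrate the point for every word at once, reading the height from the image instead of hard-coding it: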

plot(image)

rect(
  xleft = text$x1, 
  ybottom = magick::image_info(image)$height - text$y1, 
  xright = text$x2, 
  ytop = magick::image_info(image)$height - text$y2,
  border = sample(128, nrow(text)))
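
The same conversion can be wrapped in a small helper. The following is only a minimal sketch in base R: the function name bbox_to_rect and its img_height argument are hypothetical, and the two bounding-box strings are made-up values for a 480-pixel-high image, not output copied from tesseract. It assumes the boxes arrive as "x1,y1,x2,y2" strings with the origin at the top-left, as in the question.

```r
# Hypothetical helper (not part of tesseract or magick): convert
# "x1,y1,x2,y2" strings with a top-left origin into rect() arguments
# with a bottom-left origin.
bbox_to_rect <- function(bbox, img_height) {
  # Split each string on commas; rbind gives one row per box
  parts <- do.call(rbind, strsplit(bbox, ",", fixed = TRUE))
  coords <- matrix(as.numeric(parts), ncol = 4)
  data.frame(
    xleft   = coords[, 1],
    ybottom = img_height - coords[, 4],  # image y2 (lower edge) becomes the bottom
    xright  = coords[, 3],
    ytop    = img_height - coords[, 2]   # image y1 (upper edge) becomes the top
  )
}

# Made-up boxes for a 480-pixel-high image, for illustration only
boxes <- bbox_to_rect(c("36,92,96,116", "109,92,129,116"), img_height = 480)
print(boxes)
```

Applied to the raw ocr_data() result (before separate() removes the bbox column), the four returned columns could be passed straight to rect() after plot(image), since rect() accepts vectors.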

【Comments】:

Great! Thanks for the insight and the example code!
Actually @Silvan, rect is already vectorized, so the code can be simplified as shown in my edit.