Is it possible to extract text from a specific portion of an image using pytesseract

【Question title】: Is it possible to extract text from a specific portion of an image using pytesseract
【Posted】: 2020-03-15 20:12:40
【Question description】:

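I have a bounding box (rectangle coordinates) in an image and want to extract the text inside those coordinates. How can I use pytesseract to extract the text within them?

I tried copying that part of the image into another NumPy array, OpenCV-style: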

cropped_image = image[y1:y2][x1:x2]

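and then called pytesseract.image_to_string() on it, but the accuracy was very poor. When I pass the original image to pytesseract.image_to_string(), however, it extracts everything perfectly.

Is there any function in pytesseract for extracting text from a specific portion of an image?

This image has several sections of information. Suppose I have rectangle coordinates enclosing 'Online food delivering system'; how do I extract that text with pytesseract?

Please help. Thanks in advance.

Versions I am using: Tesseract 4.0.0, pytesseract 0.3.0, OpenCV 3.4.3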

【Comments】:

Yes, it is possible, but we cannot write an answer without your input image
OK, I have added an image. Please help me. @nathancy

Tags: python image opencv image-processing ocr

【Solution 1】:

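Pytesseract has no built-in function for extracting a specific portion of an image, but we can use OpenCV to extract the ROI bounding box and then pass that ROI to Pytesseract. We convert the image to grayscale, then threshold it to obtain a binary image. Assuming you have the desired ROI coordinates, we use NumPy slicing to extract the ROI.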
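One detail worth checking in the question's own attempt: `image[y1:y2][x1:x2]` applies both slices to the row axis instead of slicing rows and then columns, which may be why the cropped result gave poor OCR output. Here is a minimal sketch of the difference, using a synthetic array in place of a real image (the array size and box corners are illustrative assumptions, not values from the post):

```python
import numpy as np

# Synthetic 100x200 "image" (rows x columns); stands in for a real grayscale image.
image = np.arange(100 * 200).reshape(100, 200)

x1, y1, x2, y2 = 50, 10, 120, 40  # illustrative box corners

# Chained indexing: the second slice is applied to the rows again, not the columns.
chained = image[y1:y2][x1:x2]

# Comma indexing: rows y1..y2 and columns x1..x2, which is the intended crop.
cropped = image[y1:y2, x1:x2]

print(chained.shape)  # (0, 200): no rows survive the second row slice
print(cropped.shape)  # (30, 70): height y2 - y1, width x2 - x1
```

With other coordinates the chained form can return a non-empty but wrong strip that still spans the full image width, so checking the shape of the crop before running OCR is a cheap sanity test.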

From here we pass it to Pytesseract to get our result:

ONLINE FOOD DELIVERY SYSTEM

Code

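import cv2
import pytesseract

# Path to the Tesseract executable (Windows); adjust for your installation
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load as grayscale (flag 0), then binarize with Otsu's threshold.
# Inverting the THRESH_BINARY_INV result keeps dark source text dark on a white background.
image = cv2.imread('1.jpg', 0)
thresh = 255 - cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# ROI given as top-left corner (x, y) plus width and height.
# NumPy indexes rows first, then columns: [y:y+h, x:x+w]
x, y, w, h = 37, 625, 309, 28
ROI = thresh[y:y+h, x:x+w]

# --psm 6 tells Tesseract to assume a single uniform block of text
data = pytesseract.image_to_string(ROI, lang='eng', config='--psm 6')
print(data)

# Display the thresholded image and the ROI for visual inspection
cv2.imshow('thresh', thresh)
cv2.imshow('ROI', ROI)
cv2.waitKey()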
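Since the question notes that OCR on the full image works well, another option is to run Tesseract once on the whole image and keep only the words whose boxes fall inside the rectangle. The sketch below is an alternative to the answer above, not part of it. It assumes the dictionary layout returned by `pytesseract.image_to_data` with `output_type=pytesseract.Output.DICT`; the helper names `words_in_box` and `ocr_region_from_full_image` are hypothetical.

```python
def words_in_box(data, x, y, w, h):
    """Join the words whose boxes lie fully inside the rectangle (x, y, w, h).

    `data` is assumed to be shaped like the output of
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    words = []
    for text, left, top, width, height in zip(
        data["text"], data["left"], data["top"], data["width"], data["height"]
    ):
        if not text.strip():
            continue  # skip entries that carry no recognized word
        inside = (
            left >= x
            and top >= y
            and left + width <= x + w
            and top + height <= y + h
        )
        if inside:
            words.append(text)
    return " ".join(words)


def ocr_region_from_full_image(path, box):
    """Hypothetical wrapper: OCR the whole image, then filter words by `box`."""
    # Imported here so words_in_box stays usable without OpenCV or Tesseract installed.
    import cv2
    import pytesseract

    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    return words_in_box(data, *box)
```

A call such as `ocr_region_from_full_image('1.jpg', (37, 625, 309, 28))` would reuse the coordinates from the answer. Whether this beats cropping depends on the image, and a box that fits the text tightly may need a few pixels of slack so that words touching the border are not dropped.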

【Discussion】: