Pytesseract - OCR on image with text in different colors

【Question title】: Pytesseract - OCR on image with text in different colors
【Posted】: 2020-07-22 20:30:48
【Question】:

Pytesseract fails to extract the text when it appears in different colors. I tried inverting the image with OpenCV, but that does not work for dark text colors.

Image:

import cv2
import pytesseract


def text(image):
    # Upscale so that small glyphs are large enough for Tesseract.
    image = cv2.resize(image, (0, 0), fx=7, fy=7)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cv2.imwrite("gray.png", gray)

    blur = cv2.GaussianBlur(gray, (3, 3), 0)
    cv2.imwrite("gray_blur.png", blur)

    # With THRESH_OTSU the threshold is computed automatically and the
    # value passed here is ignored, so 0 is the conventional placeholder.
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    cv2.imwrite("thresh.png", thresh)

    # Morphological opening removes small specks of noise.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
    cv2.imwrite("opening.png", opening)

    invert = 255 - opening
    cv2.imwrite("invert.png", invert)

    # --psm 7 treats the image as a single line of text.
    data = pytesseract.image_to_string(invert, lang="eng", config="--psm 7")
    return data

Is there a way to extract both texts from the given image: DEADLINE (red) and WHITE HOUSE (white)?

【Comments】:

Tags: python opencv python-imaging-library ocr python-tesseract

【Solution 1】:

You can invert the image with ImageOps and then binarize it.

import numpy as np
import pytesseract
from PIL import Image, ImageOps

# Load as 8-bit grayscale, then invert.
img = Image.open("OCR.png").convert("L")
img = ImageOps.invert(img)
# img.show()
threshold = 240
table = []
pixel_array = img.load()
for y in range(img.size[1]):  # binarize row by row
    row = []
    for x in range(img.size[0]):
        if pixel_array[x, y] < threshold:
            row.append(0)
        else:
            row.append(255)
    table.append(row)

# Build the image from the array. dtype="uint8" is needed because a plain
# np.array(table) may be int64, which Image.fromarray cannot handle.
img = Image.fromarray(np.array(table, dtype="uint8"))
# img.show()

print(pytesseract.image_to_string(img))

Result:

In the end, img looks like this:

【Discussion】:

Thanks for sharing! I will try it out and get back to you.
I get an error when I try to run it. img = Image.fromarray(np.array(table)) # load the image from array. raise TypeError("Cannot handle this data type: %s, %s" % typekey) TypeError: Cannot handle this data type: (1, 1), <i8
@Abhi I think you may have modified my code. Alternatively, you can use img = Image.fromarray(np.array(table,dtype="uint8")).
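The `<i8` in the error message is the array's element type: `np.array` applied to a list of Python ints typically produces 64-bit integers, which `Image.fromarray` cannot map to an image mode, whereas `uint8` maps to 8-bit grayscale. A minimal sketch of the same fixed-threshold binarization, vectorized with NumPy and with the dtype set explicitly; the function name `binarize` is hypothetical and not part of the original answer:

```python
import numpy as np


def binarize(gray: np.ndarray, threshold: int = 240) -> np.ndarray:
    """Map pixels below `threshold` to 0 and all others to 255, as uint8."""
    # np.where replaces the per-pixel Python loop; astype(np.uint8) gives
    # the 8-bit element type that PIL expects for a grayscale image.
    return np.where(gray < threshold, 0, 255).astype(np.uint8)


# Small synthetic array standing in for the inverted grayscale image.
sample = np.array([[0, 239, 240], [255, 100, 250]])
result = binarize(sample)
print(result.dtype)     # → uint8
print(result.tolist())  # → [[0, 0, 255], [255, 0, 255]]
```

Assuming `img` is the inverted PIL image from the answer, `Image.fromarray(binarize(np.asarray(img)))` should give an image equivalent to the loop version.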
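It works after using dtype="uint8". Is it possible to make the code more general? The example above fails to detect the text in other images, for example i.ibb.co/d4kshyH/croped-2.png.
Thanks for your feedback. I will accept your solution, since it answers my original question!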
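On the follow-up question about generality: a single global threshold splits pixels into two groups, so when the background lies between the dark text and the light text in brightness, one of the two text colors ends up on the same side as the background. One possible approach, which does not come from this thread and is only a sketch, is to estimate the background gray level and mark every pixel that deviates strongly from it in either direction. It assumes a roughly uniform background that covers most of the image. The function name `binarize_by_background` and the `tolerance` default of 60 are illustrative choices, not tuned values.

```python
import numpy as np


def binarize_by_background(gray: np.ndarray, tolerance: int = 60) -> np.ndarray:
    """Return black text (0) on white (255) for text darker or lighter than the background."""
    # Assumption: the background dominates the image, so the median approximates it.
    background = float(np.median(gray))
    # Convert to float first so the subtraction cannot wrap around in uint8.
    deviation = np.abs(gray.astype(np.float32) - background)
    return np.where(deviation > tolerance, 0, 255).astype(np.uint8)


# Synthetic 3x5 example: mid-gray background with one dark and one bright pixel.
demo = np.full((3, 5), 128, dtype=np.uint8)
demo[1, 1] = 20   # dark "text"
demo[1, 3] = 255  # bright "text"
out = binarize_by_background(demo)
print(out[1].tolist())  # → [255, 0, 255, 0, 255]
```

The resulting array could then be wrapped with `Image.fromarray(out)` and passed to `pytesseract.image_to_string`. If the grayscale conversion erases the contrast between colored text and the background, this sketch may not be enough for every image.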