pytesseract image_to_string not accurate enough

【Question title】: pytesseract image_to_string not accurate enough
【Posted】: 2020-09-12 07:56:14
【Question】:

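I want to use Python (I'm a newbie in this language...) to loop over a cropped sudoku picture and read the digits from it. A Google search suggested pytesseract.

First I tried reading the picture with PIL: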

from PIL import Image
import pytesseract

image = Image.open('./test.png')

width, height = image.size
left = 0
top = 0
i = 0
j = 0
while (top < height):
    while (left < width):
        crop_img = image.crop((left, top, left + width / 9,  top + height / 9))
        print(i, j, pytesseract.image_to_string(crop_img, config='--psm 6'))
        left += width / 9
        j += 1
    top += height / 9
    i += 1
    left = 0
    j = 0
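
The loop above advances by `width / 9` and `height / 9` as floats, so rounding in the running sums can shift a cell edge by a pixel or, when the size is not a multiple of 9, trigger an extra iteration. A minimal sketch of one way to avoid that, assuming a hypothetical helper `cell_boxes` (not part of PIL or pytesseract) that derives every box from integer arithmetic:

```python
def cell_boxes(width, height, n=9):
    """Return (left, top, right, bottom) integer boxes for an n x n grid, row by row."""
    boxes = []
    for i in range(n):
        # Integer division keeps every edge on a whole pixel
        top, bottom = i * height // n, (i + 1) * height // n
        for j in range(n):
            left, right = j * width // n, (j + 1) * width // n
            boxes.append((left, top, right, bottom))
    return boxes


boxes = cell_boxes(100, 100)
print(len(boxes), boxes[0], boxes[-1])
```

Each tuple can be passed straight to `image.crop(box)` with PIL, or used as `image[top:bottom, left:right]` with a cv2/numpy image.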

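The printed output (screenshot not included) is not accurate enough, but not bad.

So my second attempt used cv2 instead of PIL and, as suggested in other answers, I converted the picture to black text on a white background (it may be a bit messy and not best practice, tips are welcome :)):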

import pytesseract
import cv2

image = cv2.imread('./test.png', 0)
height, width = image.shape
left = 0
top = 0
i = 0
j = 0
while (top < height):
    while (left < width):
        crop_img = image[int(top):int(top + height/9),
                         int(left):int(left + width/9)]
        thresh = cv2.threshold(
            crop_img, 155, 255, cv2.THRESH_BINARY_INV)[1]
        result = cv2.GaussianBlur(thresh, (5, 5), 0)
        result = 255 - result
        print(i, j, pytesseract.image_to_string(result, config='--psm 6'))
        left += width / 9
        j += 1
    top += height / 9
    i += 1
    left = 0
    j = 0
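
As a side note on the chain above: an inverse threshold followed by `255 - result` ends up with the same polarity as a plain `cv2.THRESH_BINARY`, so if the source already has dark digits on a light background the net effect is mostly binarization plus the softening from the blur. A small pure-Python sketch of that equivalence, ignoring the blur step; the helper names are hypothetical and only mirror the documented threshold rules:

```python
def binary(pixels, thresh=155, maxval=255):
    # Mirrors cv2.THRESH_BINARY: maxval where pixel > thresh, else 0
    return [maxval if p > thresh else 0 for p in pixels]


def binary_inv(pixels, thresh=155, maxval=255):
    # Mirrors cv2.THRESH_BINARY_INV: 0 where pixel > thresh, else maxval
    return [0 if p > thresh else maxval for p in pixels]


def invert(pixels):
    # Same as `255 - result` on an 8-bit image
    return [255 - p for p in pixels]


row = [0, 100, 155, 156, 255]
print(invert(binary_inv(row)) == binary(row))
```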

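which gives me a different result (screenshot not included).

In both cases I saved the cropped images for debugging (`.save()` for PIL and `imwrite` for cv2), and the images are actually very clear, for example the cv2 crop of cell { 2, 2 }, which was nevertheless evaluated as empty.

(The full sudoku image was attached here.)

Thanks in advance!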

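【Question comments】:

Tags: python python-tesseract opencv-python

【Solution 1】:

For this I used OpenCV for the image and then saved the board into a numpy array. The main thing I did was add config parameters to the image_to_string() call to restrict the output to digits. It does take a while, though, because it predicts each digit individually, as I think you did in your original version.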

import cv2
import numpy as np
import pytesseract

im = cv2.resize(cv2.imread('./test.png'), (900, 900))

out = np.zeros((9, 9), dtype=np.uint8)

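for x in range(9):
    for y in range(9):
        # Crop one 100x100 cell, trimming 10 px per side to drop the grid lines
        cell = im[10 + x*100:(x+1)*100 - 10, 10 + y*100:(y+1)*100 - 10, :]
        num = pytesseract.image_to_string(
            cell, config='--psm 6 --oem 1 -c tessedit_char_whitelist=0123456789').strip()
        # Empty cells yield no digits after stripping whitespace; leave them as 0
        if num.isdigit():
            out[x, y] = int(num)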

This gives me the following output for the picture in your post, with 0s for the empty cells.

array([[5, 3, 0, 0, 7, 0, 0, 0, 0],
       [6, 0, 0, 1, 9, 5, 0, 0, 0],
       [0, 9, 8, 0, 0, 0, 0, 6, 0],
       [8, 0, 0, 0, 6, 0, 0, 0, 3],
       [4, 0, 0, 8, 0, 3, 0, 0, 1],
       [7, 0, 0, 0, 2, 0, 0, 0, 6],
       [0, 6, 0, 0, 0, 0, 2, 8, 0],
       [0, 0, 0, 4, 1, 9, 0, 0, 5],
       [0, 0, 0, 0, 8, 0, 0, 7, 9]], dtype=uint8)

It's not the cleanest, but it seems to work well.

【Comments】:
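
One detail worth guarding against: depending on the pytesseract and Tesseract versions, the returned string may carry trailing whitespace such as a newline or form feed, so an empty cell is not necessarily an empty string. A minimal sketch of a defensive conversion, assuming a hypothetical helper `parse_digit` that is not part of pytesseract:

```python
def parse_digit(raw):
    """Return the single digit 1-9 found in OCR output, or 0 if there is none."""
    text = raw.strip()  # drops spaces, newlines and form feeds
    if len(text) == 1 and text in "123456789":
        return int(text)
    return 0  # empty cell or unusable reading


print(parse_digit("5\n\x0c"), parse_digit("\x0c"), parse_digit("12"))
```

In the loop above this would be used as `out[x, y] = parse_digit(num)`. Treating multi-character readings as empty is a design choice for sudoku, where each cell holds at most one digit.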