如何正确阅读easyocr的文本？答案

【问题标题】：How to read the text by easyocr correctly?如何正确阅读easyocr的文本？
【发布时间】：2022-08-24 01:33:31
【问题描述】：

我正在尝试从相机模块读取图像，到目前为止，我必须使用自适应滤波以这种方式处理图像。此外，我做了很多操作来裁剪 ROI 并阅读文本。但是，它正在读取数字而不是数字旁边的单位，这些单位的大小相对较小。我该如何解决这个问题？

import easyocr 
import cv2
import numpy as np

import matplotlib.pyplot as plt
import time
import urllib.request
url = \'http://192.168.137.108/cam-hi.jpg\'
while True:
    img_resp=urllib.request.urlopen(url)
    imgnp=np.array(bytearray(img_resp.read()),dtype=np.uint8)
    image = cv2.imdecode(imgnp,-1)
    image = cv2.medianBlur(image,7)
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)    #to gray convert
    th3 = cv2.adaptiveThreshold(gray_image,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\\
                cv2.THRESH_BINARY,11,2) #adaptive threshold gaussian filter used
    kernel = np.ones((5,5),np.uint8)
    opening = cv2.morphologyEx(th3, cv2.MORPH_OPEN, kernel)
    

    x = 0   #to save the position, width and height for contours(later used)
    y = 0
    w = 0
    h = 0

    cnts = cv2.findContours(opening, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    threshold =  10
    font = cv2.FONT_HERSHEY_SIMPLEX  
    org = (50, 50) 
    fontScale = 1 
    color = (0, 0, 0)
    thickness = 2
        
    for c in cnts:
        
        approx = cv2.approxPolyDP(c,0.01*cv2.arcLength(c,True),True)
        area = cv2.contourArea(c)   
        if  len(approx) == 4 and area > 100000:   #manual area value used to find ROI for rectangular contours
        
            cv2.drawContours(image,[c], 0, (0,255,0), 3)
            n = approx.ravel()
            font = cv2.FONT_HERSHEY_SIMPLEX
            (x, y, w, h) = cv2.boundingRect(c)
            old_img = opening[y:y+h, x:x+w]  #selecting the ROI
            width, height = old_img.shape
            cropped_img = old_img[50:int(width/2), 0:height] #cropping half of the frame of ROI to just focus on the number
            
            new = reader.readtext(cropped_img)   #reading text using easyocr
            if(new == []): 
                text = \'none\'
            else:
                text = new
                print(text)
#                 cv2.rectangle(cropped_img, tuple(text[0][0][0]), tuple(text[0][0][2]), (0, 0, 0), 2)
                if(text[0][2] > 0.5): #checking the confidence level
                    
                    cv2.putText(cropped_img, text[0][1], org, font, fontScale, color, thickness, cv2.LINE_AA)        
            cv2.imshow(\'frame1\',cropped_img)
    key = cv2.waitKey(5) 

    if key == 27:
        break

cv2.waitKey(0)
cv2.destroyAllWindows()

您问题中的代码已损坏。在 python 中，缩进是语法。请edit 并修复。
是的。我这样做了。从 python 复制到 stackoverflow 时，缩进搞砸了。但是，我只需要知道如何解决这个问题。代码没有问题，只是easyocr无法读取某些文本。
请查看minimal reproducible example。的截图输出不适合输入数据运行您的代码并重现问题。
@RitikaShrestha 你可以分享原始图像吗？
@JeruLuke 刚刚编辑了帖子。

标签： python opencv image-processing ocr easyocr

【解决方案1】：

这是我能得到的最好的。希腊符号'亩' 被识别为 'p'。我还尝试搜索与easyocr 相关的希腊语言模型，但找不到任何内容。

这是我所做的：

对整个图像执行 Otsu 阈值
选择面积最大的轮廓并对其进行裁剪
将裁剪后的图像转换为 LAB 色彩空间
在 A 通道上手动执行二进制阈值

我得到以下信息：

将此图像作为输入传递给easyocr：

from easyocr import Reader
reader = Reader(['en'])

# input is the cropped image
results = reader.readtext(crop_img)

# convert to LAB space
lab = cv2.cvtColor(crop_img, cv2.COLOR_BGR2LAB)

# threshold on A-channel
r,th = cv2.threshold(lab[:,:,1],125,255,cv2.THRESH_BINARY_INV)

# create copy of cropped image
crop_img2 = crop_img.copy()

# draw only first 5 results for clarity
# borrowed from: https://pyimagesearch.com/2020/09/14/getting-started-with-easyocr-for-optical-character-recognition/
for (bbox, text, prob) in results[:5]:
  (tl, tr, br, bl) = bbox
  tl = (int(tl[0]), int(tl[1]))
  tr = (int(tr[0]), int(tr[1]))
  br = (int(br[0]), int(br[1]))
  bl = (int(bl[0]), int(bl[1]))
  crop_img2 = cv2.rectangle(crop_img2, tl, br, (0, 0, 255), 3)
  crop_img2 = cv2.putText(crop_img2, text, (tl[0], tl[1] - 20), cv2.FONT_HERSHEY_SIMPLEX, 1.1, (0, 0, 0), 5)

【讨论】：

【解决方案2】：

如果您尝试清除图像并将路径传递给以下方法，则可以尝试

def text_extraction(image, lang_code='en'):
    reader = easyocr.Reader([lang_code], gpu=False)
    roi = cv2.imread(image)#[85:731, 265:1275]
    output = reader.readtext(roi)
    # it returns list of tuple with ([x,y coordinates],text,text_threshold)
    return output

【讨论】：