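【Question title】: How to read the text by easyocr correctly?
【Posted】: 2022-08-24 01:33:31
【Question description】:

I am trying to read images from a camera module, and so far I have had to process the image this way, using adaptive filtering. I also do a lot of manipulation to crop the ROI and read the text. However, it reads the numbers but not the units next to them, which are comparatively small. How can I solve this?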

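import easyocr
import cv2
import numpy as np

import matplotlib.pyplot as plt
import time
import urllib.request

# `reader` is used further down but was never created in the post; an English model is assumed here
reader = easyocr.Reader(['en'])
url = 'http://192.168.137.108/cam-hi.jpg'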
while True:
    img_resp=urllib.request.urlopen(url)
    imgnp=np.array(bytearray(img_resp.read()),dtype=np.uint8)
    image = cv2.imdecode(imgnp,-1)
    image = cv2.medianBlur(image,7)
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)    #to gray convert
    th3 = cv2.adaptiveThreshold(gray_image,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                cv2.THRESH_BINARY,11,2) #adaptive threshold gaussian filter used
    kernel = np.ones((5,5),np.uint8)
    opening = cv2.morphologyEx(th3, cv2.MORPH_OPEN, kernel)
    

    x = 0   #to save the position, width and height for contours(later used)
    y = 0
    w = 0
    h = 0

    cnts = cv2.findContours(opening, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    threshold =  10
    font = cv2.FONT_HERSHEY_SIMPLEX  
    org = (50, 50) 
    fontScale = 1 
    color = (0, 0, 0)
    thickness = 2
        
    for c in cnts:
        
        approx = cv2.approxPolyDP(c,0.01*cv2.arcLength(c,True),True)
        area = cv2.contourArea(c)   
        if  len(approx) == 4 and area > 100000:   #manual area value used to find ROI for rectangular contours
        
            cv2.drawContours(image,[c], 0, (0,255,0), 3)
            n = approx.ravel()
            font = cv2.FONT_HERSHEY_SIMPLEX
            (x, y, w, h) = cv2.boundingRect(c)
            old_img = opening[y:y+h, x:x+w]  #selecting the ROI
            height, width = old_img.shape  #NumPy shape is (rows, cols), i.e. (height, width)
            cropped_img = old_img[50:int(height/2), 0:width] #cropping half of the frame of ROI to just focus on the number
            
            new = reader.readtext(cropped_img)   #reading text using easyocr
            if(new == []): 
                text = 'none'
            else:
                text = new
                print(text)
#                 cv2.rectangle(cropped_img, tuple(text[0][0][0]), tuple(text[0][0][2]), (0, 0, 0), 2)
                if(text[0][2] > 0.5): #checking the confidence level
                    
                    cv2.putText(cropped_img, text[0][1], org, font, fontScale, color, thickness, cv2.LINE_AA)        
            cv2.imshow('frame1',cropped_img)
    key = cv2.waitKey(5) 

    if key == 27:
        break

cv2.waitKey(0)
cv2.destroyAllWindows()
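
The crop step in the code above depends on how NumPy orders image dimensions: `shape` is `(rows, columns)`, that is `(height, width)`. A minimal sketch of the same "upper half of the ROI" crop, assuming a 2-D or 3-D image array; the helper name `crop_upper_half` and the `top_margin` parameter are illustrative and not part of the original post:

```python
import numpy as np


def crop_upper_half(roi, top_margin=50):
    """Return the rows from top_margin down to the vertical midpoint, full width."""
    # shape[:2] works for both grayscale (H, W) and color (H, W, C) arrays
    height, width = roi.shape[:2]
    return roi[top_margin:height // 2, 0:width]


if __name__ == "__main__":
    # Synthetic blank ROI, 400 rows by 600 columns (made-up size for illustration)
    roi = np.zeros((400, 600), dtype=np.uint8)
    print(crop_upper_half(roi).shape)
```

Because the crop keeps only the upper half of the ROI, any text printed in the lower half is never passed to the OCR step.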
    
    
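  • The code in your question is broken. In Python, indentation is syntax. Please edit and fix it.
  • Yes, I did that. The indentation got messed up when copying from Python to Stack Overflow. However, I just need to know how to solve this problem. There is nothing wrong with the code; easyocr just cannot read some of the text.
  • Please review minimal reproducible example. Screenshots of output are not suitable as input data for running your code and reproducing the issue.
  • @RitikaShrestha Can you share the original image?
  • @JeruLuke Just edited the post.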

Tags: python opencv image-processing ocr easyocr


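【Solution 1】:

This is the best I could get. The Greek symbol '' is recognized as 'p'. I also tried searching for a Greek language model for easyocr, but could not find anything.

Here is what I did:

  • Perform Otsu thresholding on the whole image
  • Select the contour with the largest area and crop it
  • Convert the cropped image to the LAB color space
  • Manually apply a binary threshold on the A channel

I got the following:

Pass this image as input to easyocr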

import cv2
from easyocr import Reader
reader = Reader(['en'])

# crop_img is the cropped BGR image produced by the steps above

# convert to LAB space
lab = cv2.cvtColor(crop_img, cv2.COLOR_BGR2LAB)

# threshold on A-channel
r,th = cv2.threshold(lab[:,:,1],125,255,cv2.THRESH_BINARY_INV)

# pass the thresholded image to easyocr
results = reader.readtext(th)

# create copy of cropped image
crop_img2 = crop_img.copy()

# draw only first 5 results for clarity
# borrowed from: https://pyimagesearch.com/2020/09/14/getting-started-with-easyocr-for-optical-character-recognition/
for (bbox, text, prob) in results[:5]:
  (tl, tr, br, bl) = bbox
  tl = (int(tl[0]), int(tl[1]))
  tr = (int(tr[0]), int(tr[1]))
  br = (int(br[0]), int(br[1]))
  bl = (int(bl[0]), int(bl[1]))
  crop_img2 = cv2.rectangle(crop_img2, tl, br, (0, 0, 255), 3)
  crop_img2 = cv2.putText(crop_img2, text, (tl[0], tl[1] - 20), cv2.FONT_HERSHEY_SIMPLEX, 1.1, (0, 0, 0), 5)

【Discussion】:

    【Solution 2】:

    If you are trying to clean up the image, you can try passing its path to the following method:

    import cv2
    import easyocr

    def text_extraction(image, lang_code='en'):
        # image is the path to the image file
        reader = easyocr.Reader([lang_code], gpu=False)
        roi = cv2.imread(image)#[85:731, 265:1275]
        output = reader.readtext(roi)
        # returns a list of tuples: (bounding box corner coordinates, text, confidence)
        return output
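
Since each result is a `(bounding box, text, confidence)` tuple, the output can be filtered by confidence in the same spirit as the `> 0.5` check in the question. A small sketch, assuming that tuple layout; the helper name `confident_texts` and the sample values are made up for illustration and are not real OCR output:

```python
def confident_texts(results, min_confidence=0.5):
    """Return the text of every detection whose confidence exceeds min_confidence."""
    return [text for (_bbox, text, confidence) in results
            if confidence > min_confidence]


if __name__ == "__main__":
    # Hypothetical detections in the (bbox, text, confidence) layout
    sample = [
        ([[0, 0], [40, 0], [40, 20], [0, 20]], "12.5", 0.93),
        ([[45, 5], [60, 5], [60, 20], [45, 20]], "p", 0.31),
    ]
    print(confident_texts(sample))
```

Small text such as a unit label can come back with low confidence, so printing the rejected entries as well can help show whether the unit was detected at all.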
    

    【Discussion】:
