OCR gives wrong output

【Question title】: OCR gives wrong output
【Posted】: 2021-02-11 23:40:11
【Question】:

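I am currently working on a project that uses Python to read on-screen text from FIFA 20 gameplay video. I take a "screenshot" every x frames so that I can run pytesseract OCR on it. I have a snippet from a FIFA 20 match that contains the words/numbers I want to extract (time, score, and the two team names).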
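The "every x frames" sampling described above could be sketched as follows. This is a minimal sketch, not the asker's actual code: `sample_frames` and the file name `match.mp4` are hypothetical, and the helper only assumes an object whose `read()` returns `(ok, frame)`, which is how `cv2.VideoCapture` behaves.

```python
def sample_frames(capture, every_n):
    """Yield (frame_index, frame) for every `every_n`-th frame of `capture`.

    `capture` is any object with a read() method returning (ok, frame),
    for example a cv2.VideoCapture.
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of the stream or a read error
            break
        if index % every_n == 0:
            yield index, frame
        index += 1


# Usage with OpenCV (the file name is a placeholder):
#   import cv2
#   cap = cv2.VideoCapture("match.mp4")
#   for idx, frame in sample_frames(cap, every_n=30):
#       ...  # crop the frame and run OCR here
#   cap.release()
```

Keeping the sampling separate from the OCR step makes it easy to tune x without touching the rest of the pipeline.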

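The only problem is that I want to crop the image so that the match time, team name, score, and opponent name each end up in a separate picture. I could then run OCR on each picture individually, because pytesseract does not read the whole image well. I have already tried some filters, edge detection (cv2.Canny()), and so on, but I do not get the output I need. Because of the gap between the time and the team names/score I get some strange characters, and there are also some strange characters in the score (because of the black background?).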

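So my question is: what is the best way to solve this? Is there a way to build some kind of adaptive cropping that cuts the team names, the time, and the score into separate pictures, so that I can run OCR on each of them individually? Or is there another way to do this?

Thanks in advance!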

Data I want to retrieve from image

Edit: Yes, I tried to make a mask by filtering out every color except the yellow of the team names, using the following code:

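import cv2
import numpy as np

# `image` is a BGR frame taken from the video
roi_teamnames = image[55:90, 120:900]  # crop the strip holding the team names
roi_teamnames = cv2.cvtColor(roi_teamnames, cv2.COLOR_BGR2HSV)
lower = np.array([0, 25, 147], dtype="uint8")
upper = np.array([32, 255, 255], dtype="uint8")
roi_teamnames = cv2.inRange(roi_teamnames, lower, upper)  # white where the pixel is within the range
cv2.imshow("Teamnames", roi_teamnames)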

This gives me a good result that I can use (see link).

Result with mask

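But now the question is: is there a way to automatically detect the blank gap, so that I can crop the image into 2 separate images that each contain one of the names?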
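One possible way to find that gap is a column projection of the binary mask: look for the widest run of columns that contain no white pixels and cut in the middle of it. The sketch below is only an illustration under a strong assumption, namely that the gap between the two names is wider than any gap between letters or words inside a name. The function name `split_at_widest_gap` is hypothetical.

```python
import numpy as np


def split_at_widest_gap(mask):
    """Split a binary mask into (left, right) at its widest empty column run.

    Returns None when the mask has no interior gap to split at.
    """
    # Indices of the columns that contain at least one non-zero pixel
    text_cols = np.flatnonzero(mask.any(axis=0))
    if text_cols.size < 2:
        return None
    # Distance between consecutive text columns; 1 means they are adjacent
    gaps = np.diff(text_cols)
    widest = int(np.argmax(gaps))
    if gaps[widest] <= 1:  # all text columns are contiguous
        return None
    # Cut halfway between the last column before the gap and the first after it
    cut = int((text_cols[widest] + text_cols[widest + 1]) // 2)
    return mask[:, :cut], mask[:, cut:]
```

With the mask produced by `cv2.inRange`, the two returned arrays could then be passed to pytesseract separately. If the widest-gap assumption fails for some team names, a minimum gap width in pixels would be a natural extra parameter.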

【Comments】:

Welcome to Stack Overflow! Please add your code attempts to the question. It helps other visitors come up with a workable solution.

Tags: python opencv ocr cv2 python-tesseract

【Solution 1】:

**Here is some code I found somewhere on Stack Overflow for detecting text in an image; it may help you.**

import cv2

# Load image, grayscale, Gaussian blur, adaptive threshold
image = cv2.imread(r'E:\tesseract.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (9,9), 0)
thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,11,30)

# Dilate to combine adjacent text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
dilate = cv2.dilate(thresh, kernel, iterations=4)

# Find contours and highlight text areas
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]  # OpenCV 4 returns 2 values, OpenCV 3 returns 3

for c in cnts:
    area = cv2.contourArea(c)
    if area > 100:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 3)
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()
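The code above draws the detected boxes but does not crop them. A minimal sketch of the cropping step might look like the following. It assumes the `(x, y, w, h)` tuples come from `cv2.boundingRect`, filters on bounding-box area rather than contour area, and sorts the boxes left to right; `crop_regions` is a hypothetical helper name.

```python
import numpy as np


def crop_regions(image, boxes, min_area=100):
    """Return a crop of `image` for each (x, y, w, h) box, ordered left to right.

    Boxes whose area is not larger than `min_area` are skipped as noise.
    """
    kept = [b for b in boxes if b[2] * b[3] > min_area]
    kept.sort(key=lambda b: b[0])  # order by the x coordinate of the left edge
    # NumPy images are indexed as [row, column], i.e. [y, x]
    return [image[y:y + h, x:x + w].copy() for x, y, w, h in kept]
```

Each crop could then be handed to pytesseract on its own. Note that the crops should be taken from a copy of the frame made before `cv2.rectangle` is called, otherwise the green borders end up inside the crops.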

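【Comments】:

【Solution 2】:

Here is code that extracts the team names, the time, and the score from the image you provided above. It makes some assumptions (see the code) that may be sufficient... or not. You will have to try. The code: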

from PIL import Image, ImageOps
import pytesseract
from pytesseract import Output

img = Image.open('image.png')
gray=ImageOps.grayscale(img)

# We ASSUME team names are always in capital letters, plus the following signs: "-", "/" and "." (add more if needed)
letters=pytesseract.image_to_data(gray, output_type=Output.DICT, config='-c tessedit_char_whitelist=\\ -/.ABCDEFGHIJKLMNOPQRSTUVWXYZ' )

# For the numbers (time, scores):
# We ASSUME time and scores are always in light characters on a dark background. So we work on the color-inverted image because
# light characters on a black background are not properly recognized.
# We ASSUME times and scores are made of digits, plus ":" and "-"
numbers=pytesseract.image_to_data(ImageOps.invert(gray), output_type=Output.DICT, config='-c tessedit_char_whitelist=:-0123456789' )


# Working on numbers:
conf_numbers=[x for x in numbers['conf']] # confidence value of each detected element
data_num=[]
for i in range(0,len(conf_numbers)):
    if int(conf_numbers[i])>-1:  # a confidence of -1 marks an element without recognized text
        data_num.append(numbers['text'][i])

time=data_num[0]
score=data_num[1]


# Working on letters:
conf_letters=[int(item) for item in letters['conf']]
words=[]
for i in range(0,len(conf_letters)):
    if conf_letters[i]>75:   # confidence threshold (0 - 100%) Note: You can print these values to get a feel for the needed threshold
        words.append(letters['text'][i])
    else:
        words.append('*')
words.append('*')


# Assembling words to make teams names:
Flag=0
teams_names=[]
team=''
for i in range(0, len(words)):    
    
    if Flag==1 and not(words[i]=='*'):       
        team=team+words[i]
        
    if Flag==1 and words[i]=='*':
        teams_names.append(team)
        team=''
        Flag=0

    if Flag==0 and not(words[i]=='*'):
        Flag=1
        team=words[i]+' '
team_A=teams_names[0]
team_B=teams_names[1]

print('Team A: ',team_A)
print('Team B: ',team_B)
print('Time: ',time)
print('Score: ',score)

The output is:

Team A:  WOLVES FIFILZA
Team B:  FNATIC TEKKZ
Time:  71:26
Score:  0-1
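Since OCR can still return stray characters, it may be worth validating the time and score strings before using them. The sketch below relies on the same assumption the answer makes (digits plus ":" for the time and "-" for the score); the patterns and the helper names `parse_time` and `parse_score` are hypothetical.

```python
import re

# Match clock such as "71:26": minutes, a colon, then seconds from 00 to 59
TIME_RE = re.compile(r"^(\d{1,3}):([0-5]\d)$")
# Score such as "0-1": one or two digits on each side of the dash
SCORE_RE = re.compile(r"^(\d{1,2})-(\d{1,2})$")


def parse_time(text):
    """Return (minutes, seconds), or None if `text` is not a valid match clock."""
    m = TIME_RE.match(text.strip())
    return (int(m.group(1)), int(m.group(2))) if m else None


def parse_score(text):
    """Return (home_goals, away_goals), or None if `text` is not a valid score."""
    m = SCORE_RE.match(text.strip())
    return (int(m.group(1)), int(m.group(2))) if m else None
```

A frame for which either function returns None could simply be skipped, relying on the next sampled frame instead.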

Best regards,

Stephane

【Comments】: