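This may be a long shot, but did you download the mc training data from link?
If so, that training data has problems with certain characters and only works reliably for digits. Another important step is to crop away the background around the text.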
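For the cropping step, a minimal NumPy sketch is below. It assumes the image has already been binarized so that text pixels are nonzero and the background is 0. `crop_to_text` and its `margin` parameter are names made up for illustration, not part of any library.

```python
import numpy as np


def crop_to_text(binary, margin=2):
    """Crop a 2-D binarized image to the bounding box of its nonzero pixels.

    Assumes text pixels are nonzero and background pixels are 0.
    `margin` (hypothetical parameter) keeps a few background pixels
    around the text.
    """
    rows = np.flatnonzero(binary.any(axis=1))  # rows that contain text
    cols = np.flatnonzero(binary.any(axis=0))  # columns that contain text
    if rows.size == 0:
        return binary  # no text pixels found, nothing to crop to
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + margin + 1, binary.shape[0])
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + margin + 1, binary.shape[1])
    return binary[top:bottom, left:right]
```

How much margin helps is something to test on your own screenshots; a few pixels is only a starting guess.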
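I'm working on a similar project here, with a few differences:
(it uses tesserocr, which is faster for video or large numbers of images)
(it reads the F3 debug menu, where the text is guaranteed to be white)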
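If you look at process_image, it takes the image, blacks out all non-gray pixels, and then applies cv2.threshold(im_arr,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
After applying all the effects, I also crop away the leading columns to minimize the large amount of blank space at the start.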
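Instead of checking for gray pixels, you could try something like:
# Check out HSV masking/filtering in the OpenCV documentation.
# image must already be in HSV, e.g. cv2.cvtColor(image, cv2.COLOR_BGR2HSV).
image = cv2.inRange(image, (h_min, s_min, v_min), (h_max, s_max, v_max))
ret3, image = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)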
import cv2
import numpy as np


def process_image(im, crop_to_activity=False):
    """
    Converts the image to a numpy array, then applies preprocessing.
    """
    im_arr = np.array(im)
    height, width, depth = im_arr.shape
    maxdev = 2
    # Black out every pixel whose channels deviate too far from gray.
    for i in range(height):
        for j in range(width):
            # Cast to int so the arithmetic below cannot overflow uint8.
            r, g, b = (int(c) for c in im_arr[i][j])
            r = (r + 150) / 2
            g = (g + 150) / 2
            b = (b + 150) / 2
            mean = (r + g + b) / 3
            diffr = abs(mean - r)
            diffg = abs(mean - g)
            diffb = abs(mean - b)
            if (diffr + diffg + diffb) > maxdev:
                im_arr[i][j][0] = 0
                im_arr[i][j][1] = 0
                im_arr[i][j][2] = 0
    im_arr = cv2.cvtColor(im_arr, cv2.COLOR_BGR2GRAY)
    # Alternatives that were tried and left disabled:
    # cap_arr = cv2.threshold(cap_arr,127,255,cv2.THRESH_BINARY)
    # blur = cv2.GaussianBlur(cap_arr,(3,3),0)
    # Otsu's thresholding picks the binarization threshold automatically.
    ret3, im_arr = cv2.threshold(im_arr, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    if crop_to_activity:
        # Find the first column that contains a nonzero (text) pixel.
        first_column = -1
        for j in range(width):
            for i in range(height):
                v = im_arr[i][j]
                if v != 0:
                    first_column = j
                    break
            if first_column != -1:
                break
        # Keep a 3-pixel margin to the left of the text.
        first_column = max(0, first_column - 3)
        im_arr = im_arr[:, first_column:]
    return im_arr
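The per-pixel Python loop in process_image is slow on large frames. As a rough sketch, the same gray test could be vectorized with NumPy. `zero_non_gray` is a hypothetical name, and the result should match the loop only up to floating-point rounding right at the threshold.

```python
import numpy as np


def zero_non_gray(rgb, maxdev=2):
    """Vectorized sketch of the gray test used in process_image.

    The loop measures deviations on (c + 150) / 2, which halves them, so
    the equivalent test on the raw channels is sum(|mean - c|) / 2 > maxdev.
    """
    chans = rgb[..., :3].astype(np.float64)  # float copy avoids uint8 overflow
    mean = chans.mean(axis=-1, keepdims=True)
    deviation = np.abs(chans - mean).sum(axis=-1) / 2
    out = rgb.copy()                         # leave the input untouched
    out[deviation > maxdev, :3] = 0          # black out the colored pixels
    return out
```

The output can then go through the same cv2.cvtColor and cv2.threshold calls as before.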