How to extract numbers from a captcha image in Python?

【Question title】: How to extract numbers from a captcha image in Python?
【Posted】: 2021-10-31 15:27:15
【Question】:

I want to extract the digits from a captcha image, so I tried this code, taken from this answer:

try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract
import cv2

file = 'sample.jpg'

img = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, None, fx=10, fy=10, interpolation=cv2.INTER_LINEAR)
img = cv2.medianBlur(img, 9)
th, img = cv2.threshold(img, 185, 255, cv2.THRESH_BINARY)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (4,8))
img = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
cv2.imwrite("sample2.jpg", img)


file = 'sample2.jpg'
text = pytesseract.image_to_string(file)
print(''.join(x for x in text if x.isdigit()))

It works well on this image:

Output: 436359
However, when I try it on this image:

it gives me nothing. Output: .
How can I modify my code to get the digits from the second image as a string?

Edit:
I tried Matt's answer, and it works well on the image above. But it fails to recognize the digits (8, 1) in image A and the digit (7) in image B.
Image A

Image B
How can I fix this?

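【Comments】:

I think you should use this: How to extract numbers from a complex captcha
@whoamins, it doesn't work. As I mentioned, I took my code from an answer to the very question you are pointing to.
Your code fails to recognize the digits in this particular image, but there is no error in it; the chosen parameters simply don't work on this sample. It is a very hard test case because the characters are all slanted in different directions. Is your goal to OCR this particular image correctly, or to implement a method that correctly recognizes as many images as possible? (Nothing is perfect.)
@MattL., my goal is to OCR this image correctly.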

Tags: python opencv computer-vision python-imaging-library captcha

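【Solution 1】:

In general, getting OCR just right on an image like this comes down to the order and the parameters of the transformations. For example, in the code snippet below I first convert to grayscale, then erode, then dilate, then erode again. I then use a threshold to convert to binary (pure black and white), and dilate and erode once more. This produced the correct value 859917 for me and should be reproducible.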

import cv2
import numpy as np
import pytesseract

file = 'sample2.jpg'
img = cv2.imread(file)

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Erode, dilate, then erode again on the grayscale image
ekernel = np.ones((1, 2), np.uint8)
eroded = cv2.erode(gray, ekernel, iterations=1)
dkernel = np.ones((2, 3), np.uint8)
dilated = cv2.dilate(eroded, dkernel, iterations=1)
ekernel = np.ones((2, 2), np.uint8)
eroded_again = cv2.erode(dilated, ekernel, iterations=1)

# Threshold to pure black and white; 200 is the cutoff
th, threshed = cv2.threshold(eroded_again, 200, 255, cv2.THRESH_BINARY)

# Dilate and erode once more on the binary image
dkernel = np.ones((2, 2), np.uint8)
threshed_dilated = cv2.dilate(threshed, dkernel, iterations=1)
ekernel = np.ones((2, 2), np.uint8)
threshed_eroded = cv2.erode(threshed_dilated, ekernel, iterations=1)

# Run OCR and keep only the digit characters
text = pytesseract.image_to_string(threshed_eroded)
print(''.join(x for x in text if x.isdigit()))

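【Comments】:

Thank you very much, my friend, you saved my day. However, the code does not read the light-yellow digits. How should I handle that?
The most obvious thing to try is changing the first number passed to the threshold function. It controls the cutoff used when converting to black and white.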
But really, it is about tuning your settings so that they work in the largest number of cases. Nothing will work perfectly for every image.