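【Posted】: 2018-01-18 14:58:33
【Question】:
I am having trouble with sample code that reads text from an image using Python, OpenCV, and OCR.
The code was written for Python 2.7 and I am using Python 3.6, so I may have missed some changes between these versions.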
import cv2
import numpy as np
import pytesseract
from PIL import Image

src_path = "C:/Users/crist/Desktop/borrar/lectura/"

def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)
    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.png", img)
    # Apply threshold to get image with only black and white
    #img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)
    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
    #os.remove(temp)
    return result

print ('--- Start recognize text from image ---')
print (get_string(src_path + "2.png"))
print ("------ Done -------")
Error:
--- Start recognize text from image ---
Traceback (most recent call last):
  File "C:/Users/crist/PycharmProjects/LectorTexto/lectorCapcha.py", line 40, in <module>
    print (get_string(src_path + "2.png"))
  File "C:/Users/crist/PycharmProjects/LectorTexto/lectorCapcha.py", line 31, in get_string
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
  File "C:\Users\crist\AppData\Local\Programs\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 122, in image_to_string
    config=config)
  File "C:\Users\crist\AppData\Local\Programs\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 46, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE)
  File "C:\Users\crist\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "C:\Users\crist\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 992, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] El sistema no puede encontrar el archivo especificado
Process finished with exit code 1
【Discussion】:
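-
The error appears to be on the pytesseract.image_to_string() line. Does the image you are trying to access actually exist at that point?
-
Your error appears to be "The system cannot find the file specified." That usually means the file does not exist or the path is wrong.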
-
Hello, and thanks for your time! Yes, it does exist. I mean, everything works fine up to the actual text recognition.
-
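1) Read the original image with OpenCV 2) convert it to grayscale 3) erode, dilate, and save 4) convert to black and white and save. Up to that point everything works! I can go to the folder and see the new images with the transformations applied. I would think that if it really were a path problem, the script would not be able to find the first image. I don't know what else to do! Thanks!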
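-
One thing worth noting: the traceback ends inside subprocess.Popen, not inside Image.open, so the file the system cannot find may be the program being launched rather than any of the images. The sketch below reproduces that situation in isolation. The helper name can_launch is made up for illustration, and it assumes pytesseract is still using its default command name, "tesseract".

```python
import subprocess
import sys


def can_launch(command):
    """Return True if the OS can start `command`, False if the executable is missing."""
    try:
        proc = subprocess.Popen(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # Same exception type as in the traceback: the program itself was not found.
        return False
    proc.wait()
    return True


if __name__ == "__main__":
    # The running Python interpreter certainly exists, so this can be launched.
    print("python:", can_launch([sys.executable, "--version"]))
    # "tesseract" is the command name pytesseract uses unless told otherwise.
    print("tesseract:", can_launch(["tesseract", "--version"]))
```

If the second line prints False while the saved images are all present in the folder, the missing file is most likely the Tesseract executable, not an image.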
-
I installed pytesseract via pip. Is there anything else I missed? Do I have to install both tesseract and pytesseract?
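-
On that last question: pytesseract is only a Python wrapper that launches the Tesseract program, so installing it with pip does not install Tesseract itself. A minimal sketch of how the script could locate the executable and tell pytesseract where it is follows. The helper names resolve_executable and configure_pytesseract are hypothetical, and any candidate path passed in is an assumption about where Tesseract was installed; pytesseract.pytesseract.tesseract_cmd is the module attribute pytesseract reads when building the command.

```python
import os
import shutil


def resolve_executable(name, candidates=()):
    """Return a usable path for `name`, or None if it cannot be found."""
    # First look on PATH, the same way the operating system would.
    found = shutil.which(name)
    if found:
        return found
    # Then fall back to explicit locations supplied by the caller.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(candidates=()):
    """Point pytesseract at the Tesseract executable; raise if it is missing."""
    import pytesseract  # imported here so resolve_executable works without it

    cmd = resolve_executable("tesseract", candidates)
    if cmd is None:
        raise RuntimeError(
            "Tesseract executable not found; install Tesseract or add it to PATH"
        )
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Calling configure_pytesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"]) once before get_string(...) would be one way to use it; that path is only a guess at a typical Windows install location and should be replaced with the real one.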
Tags: python python-3.x opencv ocr