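【Posted】: 2021-06-09 10:07:51
【Question】:
I want to draw a bounding box around every question and around each of that question's options, then extract the text of each question into a Pandas data frame that I will later export to Excel. For this I have a Python file that detects the four options [(a), (b), (c), (d)] and the question. The problem is that when I run PyTesseract on the whole image (without any bounding boxes) it gives me the output I want, but when I try to extract text from the bounding boxes it makes a lot of text-detection errors. I have attached my Python file below. Can anyone tell me how to detect the text inside these bounding boxes correctly?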
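For the last step described above (collecting each question and its options into a Pandas data frame and exporting it to Excel), a minimal sketch might look like the following. The helper names `build_rows` and `export_to_excel`, the column names, and the shape of the `questions` input are all assumptions made for illustration; they are not part of the code in this post.

```python
def build_rows(questions):
    """Flatten parsed questions into one dict per spreadsheet row.

    `questions` is a hypothetical list of (question_text, options) pairs,
    where `options` maps "(a)".."(d)" to the OCR text of that option.
    """
    rows = []
    for text, options in questions:
        row = {"question": text}
        for label in ("(a)", "(b)", "(c)", "(d)"):
            # a missing option becomes an empty cell instead of raising
            row[label] = options.get(label, "")
        rows.append(row)
    return rows


def export_to_excel(rows, path):
    """Write the rows to an Excel file (needs pandas plus an Excel engine such as openpyxl)."""
    import pandas as pd  # imported lazily so build_rows works without pandas

    pd.DataFrame(rows).to_excel(path, index=False)
```

Calling `export_to_excel(build_rows(parsed), "questions.xlsx")` would then write one row per question, where `parsed` stands for whatever list the OCR step produces.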
Python code:
import cv2
import pytesseract as tess

# read the image using OpenCV (raw string so the backslash is taken literally)
image = cv2.imread(r"E:\PythonTarget.jpg")
# make a copy of this image to draw in
image_copy = image.copy()
# the target word to search for
target_word_a = "(a)"
target_word_b = "(b)"
target_word_c = "(c)"
target_word_d = "(d)"
# get all data from the image
data = tess.image_to_data(image, output_type=tess.Output.DICT)
# get all occurrences of each target word
word_occurences_a = [i for i, word in enumerate(data["text"]) if word.lower() == target_word_a]
word_occurences_b = [i for i, word in enumerate(data["text"]) if word.lower() == target_word_b]
word_occurences_c = [i for i, word in enumerate(data["text"]) if word.lower() == target_word_c]
word_occurences_d = [i for i, word in enumerate(data["text"]) if word.lower() == target_word_d]
for occ in word_occurences_a:
    # extract the width, height, top and left position for that detected word
    # (the width is widened by 1000 px so the box covers the option text)
    w = data["width"][occ] + 1000
    h = data["height"][occ]
    l = data["left"][occ]
    t = data["top"][occ]
    # define all the surrounding box points
    p1 = (l, t)
    p2 = (l + w, t)
    p3 = (l + w, t + h)
    p4 = (l, t + h)
    # draw the 4 lines (rectangle)
    image_copy = cv2.line(image_copy, p1, p2, color=(255, 0, 0), thickness=4)
    image_copy = cv2.line(image_copy, p2, p3, color=(255, 0, 0), thickness=4)
    image_copy = cv2.line(image_copy, p3, p4, color=(255, 0, 0), thickness=4)
    image_copy = cv2.line(image_copy, p4, p1, color=(255, 0, 0), thickness=4)
    # crop the bounding box region as a cv2 image
    crop = image_copy[t: t + h, l:l + w]
    # extract text from the cropped image
    results = tess.image_to_string(crop)
    # print the extracted text
    print(results)
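One possible cause, offered as an assumption and not a confirmed diagnosis, is that the crop is taken from `image_copy`, which already has the 4-pixel box lines drawn along its edges, and that the crop is a tight single line passed to Tesseract with its default page segmentation mode. The sketch below crops from the untouched `image` instead, adds a few pixels of padding, and passes `--psm 7` (treat the image as a single text line). The helper names `padded_box` and `ocr_region` and the padding of 10 pixels are hypothetical choices for illustration.

```python
def padded_box(left, top, width, height, img_w, img_h, pad=10):
    """Expand a box by `pad` pixels on every side, clamped to the image bounds."""
    x0 = max(left - pad, 0)
    y0 = max(top - pad, 0)
    x1 = min(left + width + pad, img_w)
    y1 = min(top + height + pad, img_h)
    return x0, y0, x1, y1


def ocr_region(image, left, top, width, height, pad=10):
    """OCR one region of the untouched image as a single line of text."""
    import pytesseract as tess  # imported lazily; needs a Tesseract install

    img_h, img_w = image.shape[:2]
    x0, y0, x1, y1 = padded_box(left, top, width, height, img_w, img_h, pad)
    # crop from the original image, not from the copy with lines drawn on it
    crop = image[y0:y1, x0:x1]
    # --psm 7 tells Tesseract to treat the crop as a single text line
    return tess.image_to_string(crop, config="--psm 7").strip()
```

Inside the loop, `results = ocr_region(image, l, t, w, h)` would replace the crop and `image_to_string` calls, while the rectangle can still be drawn on `image_copy` for display.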
【Comments】:
-
Can you share the original image? Or is the uploaded image the unmodified original?
-
Yes, the uploaded image is the original.
Tags: python opencv python-tesseract bounding-box