【Question Title】: Tesseract can't find word for some odd reason
【Posted】: 2021-08-25 21:15:12
【Question】:

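I can find other words in other images, but this particular case is proving to be a challenge for Tesseract.

I can accurately find the word "Description" in this image:

But the word "Desktop" returns no results here, even after several attempts, including a blurred attempt:

Here is my code: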

import os
import sys
import time

import cv2
import pyautogui
import pytesseract
from PIL import Image
from pytesseract import Output


def OCRscreen(keyword, justCaseCheck=False):
    # Call OCRscreen(<word you're looking for>) to find its screen coordinates.
    # Pass justCaseCheck=True to only check whether the word is on screen;
    # in that case None is returned when it is not found.
    keyword = str(keyword)
    numTries = 0        # number of failed attempts so far
    maxNumTries = 3     # maximum number of attempts

    # Retry a few times to give the page time to load
    while True:
        pyautogui.screenshot('imageToOCR.png')     # overwrites the previous screenshot

        # Read the screenshot with OpenCV and convert it to grayscale
        readImage = cv2.imread('imageToOCR.png')
        gray = cv2.cvtColor(readImage, cv2.COLOR_BGR2GRAY)
        # Otsu's threshold to binarize foreground from background
        # (same result as the original 255 - THRESH_BINARY_INV image)
        thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

        # Write the thresholded image to disk as a temp file to OCR it
        tempFile = "{}.png".format(os.getpid())
        cv2.imwrite(tempFile, thresh)
        results = pytesseract.image_to_data(Image.open(tempFile), lang='eng', config='--psm 6', output_type=Output.DICT)

        # Check whether the keyword showed up on screen
        if keyword in results["text"]:
            break

        print("Attempted without blur. Now trying a blurred image... ")

        # Try again on a blurred copy, saved under its own file name
        blurredImage = cv2.GaussianBlur(thresh, (3, 3), 0)
        tempFile_blur = "{}_blur.png".format(os.getpid())
        cv2.imwrite(tempFile_blur, blurredImage)
        results = pytesseract.image_to_data(Image.open(tempFile_blur), lang='eng', config='--psm 6', output_type=Output.DICT)

        if keyword in results["text"]:
            break

        numTries += 1
        print(f"Identifier '{keyword}' was not found. This was attempt {numTries}. After {maxNumTries} attempts the script will stop. ")

        if numTries == maxNumTries:
            print(f"We have reached {maxNumTries} attempts. ")
            if justCaseCheck:
                return None     # keyword not found
            input("We are either not on the correct page, or OCR has failed to find our keyword. Script will now exit. ")
            sys.exit()

        time.sleep(0.5)

    # The original snippet returned center_x, center_y without defining them.
    # Taking the center of the first match's bounding box is an assumption.
    i = results["text"].index(keyword)
    center_x = results["left"][i] + results["width"][i] // 2
    center_y = results["top"][i] + results["height"][i] // 2
    return center_x, center_y

Can anyone offer some insight into why this happens and how I can fix it?

【Comments】:

    Tags: python ocr tesseract python-tesseract


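    【Solution 1】:

    I re-read what the PSM option does in PyTesseract and realized I had misunderstood it. As I understand it, --psm 6 tells Tesseract to treat the image as a single block of text, the way you would read a paragraph. I switched to --psm 11, which looks for sparse text instead, and that is a better fit for this case.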
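
A minimal sketch of that change, assuming the same `pytesseract.image_to_data` call as in the question, where only the `config` string differs. The helper names `ocr_sparse` and `find_word_center` are hypothetical, and the bounding-box lookup is one plausible way to get coordinates, not something taken from the original code. The `sample` dict is hand-written for illustration and is not real OCR output.

```python
def ocr_sparse(image, lang="eng"):
    """Run Tesseract in sparse-text mode (--psm 11) and return its word data as a dict."""
    # Imported here so the pure helper below can be used without Tesseract installed.
    import pytesseract
    from pytesseract import Output

    return pytesseract.image_to_data(
        image, lang=lang, config="--psm 11", output_type=Output.DICT
    )


def find_word_center(results, keyword):
    """Return the (x, y) center of the first exact match of keyword, or None."""
    for i, text in enumerate(results["text"]):
        if text == keyword:
            center_x = results["left"][i] + results["width"][i] // 2
            center_y = results["top"][i] + results["height"][i] // 2
            return center_x, center_y
    return None


# Hand-written stand-in for image_to_data output, for illustration only
sample = {
    "text": ["", "Desktop", "Description"],
    "left": [0, 100, 300],
    "top": [0, 40, 200],
    "width": [0, 60, 90],
    "height": [0, 20, 18],
}
print(find_word_center(sample, "Desktop"))
```

With a real screenshot, the equivalent call would be something like `find_word_center(ocr_sparse('imageToOCR.png'), 'Desktop')`. Whether `--psm 11` helps depends on how the text is laid out on screen, so it is worth comparing against `--psm 6` on the failing image.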

    【Comments】:
