Detecting individual boxes in W2 with opencv - python (answers)

【Question title】: Detecting individual boxes in W2 with opencv - python
【Posted】: 2017-04-30 11:04:42
【Question】:

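I have done extensive research but cannot find a combination of techniques that does what I need.

I need to perform OCR on hundreds of W2s to extract data for a reconciliation. The W2s are of very poor quality because they were printed and then scanned back into the computer. That process is outside my control; unfortunately, I have to work with what I have.

I completed this process successfully last year, but I had to brute-force it because timeliness was a major concern. I manually specified the coordinates to extract data from, then ran OCR on those segments one at a time. This year I would like a more dynamic approach, since the coordinates, the format, and so on may change.

I have included a sample, scrubbed W2 below. The idea is for each box on the W2 to become its own rectangle, and to extract the data by iterating over all the rectangles. I have tried several edge detection techniques, but none delivered exactly what is needed. I believe I have not yet found the right combination of preprocessing steps. I have tried to mirror some of the Sudoku puzzle detection scripts.

Here is the result of what I have tried so far, along with the Python code, which works with either OpenCV 2 or 3: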

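import cv2
import numpy as np

img = cv2.imread(image_path_here)  # image_path_here: path to the scanned W2

# Halve the image size; integer division keeps the dimensions as ints
newx,newy = img.shape[1]//2,img.shape[0]//2
img = cv2.resize(img,(newx,newy))
blur = cv2.GaussianBlur(img, (3,3),5)
ret,thresh1 = cv2.threshold(blur,225,255,cv2.THRESH_BINARY)

gray = cv2.cvtColor(thresh1,cv2.COLOR_BGR2GRAY)

edges = cv2.Canny(gray,50,220,apertureSize = 3)

minLineLength = 20
maxLineGap = 50
# Pass the last two values by keyword; positionally they would fill the wrong parameters
lines = cv2.HoughLinesP(edges,1,np.pi/180,100,minLineLength=minLineLength,maxLineGap=maxLineGap)

# HoughLinesP returns None when nothing is found; reshape covers both the OpenCV 2 and 3 array layouts
if lines is not None:
    for x1,y1,x2,y2 in lines.reshape(-1,4):
        cv2.line(img,(int(x1),int(y1)),(int(x2),int(y2)),(255,0,255),2)

cv2.imshow('hough',img)
cv2.waitKey(0)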

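【Comments】:

The problem is that vertical lines are hard to detect with these parameters. Try finding the vertical lines with lines_v = cv2.HoughLinesP(edges,1,np.pi,100,minLineLength,maxLineGap) and run a second loop over them. Experiment with different parameter values for the HoughLinesP function, perhaps using different values for horizontal and vertical lines.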

Tags: python python-2.7 opencv computer-vision edge-detection

【Solution 1】:

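Let me know if anything in my code is unclear. The biggest weaknesses of this concept are:

1: Noisy breaks in the main box lines will split the grid into separate blobs.

2: I don't know whether handwritten text can occur here, but letters that overlap the box edges could cause problems.

3: It does no orientation checking at all (you may want to improve this, as I don't think it would be too hard and it would give you more accurate handles). It relies on your boxes being roughly aligned with the x and y axes; if they are skewed enough, all the box corners will be noticeably offset (although it should still find every box).

I adjusted the threshold set point a little to separate all the text from the edges; if necessary, you could probably pull it even lower before the main lines start to break. Also, if you are worried about breaks in the lines, you could add every sufficiently large blob to the final image.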

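Basically, the first step is to adjust the threshold until it sits at the most stable cutoff for separating text and noise from the boxes (probably the lowest value that still keeps the boxes connected).

The second step is to find the largest positive blob (this should be the box grid). If your boxes do not hold together, you may want to take the few largest blobs instead, but that gets messy, so try to tune the threshold so that the grid comes out as a single blob.

The last step is to get the rectangles. For this I simply look for negative blobs (ignoring the first one, which is the outer area).

Here is the code (sorry that it is in C++, but hopefully you understand the concept and would write it yourself anyway):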

#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <iostream>
#include <stdio.h>
#include <opencv2/opencv.hpp>

using namespace cv;


//Attempts to find the largest connected group of points (assumed to be the interconnected boundaries of the textbox grid)
Mat biggestComponent(Mat targetImage, int connectivity=8)
{
    Mat inputImage;
    inputImage = targetImage.clone();
    Mat finalImage;// = inputImage;
    int greatestBlobSize=0;
    std::cout<<"Top"<<std::endl;
    std::cout<<inputImage.rows<<std::endl;
    std::cout<<inputImage.cols<<std::endl;

    for(int i=0;i<inputImage.cols;i++)
    {
        for(int ii=0;ii<inputImage.rows;ii++)
        {
            if(inputImage.at<uchar>(ii,i)!=0)
            {
                Mat lastImage;
                lastImage = inputImage.clone();
                Rect boundBox; //receives the bounding box of the filled blob (not used further here)
                int blobSize = floodFill(inputImage, cv::Point(i,ii), Scalar(0),&boundBox,Scalar(200),Scalar(255),connectivity);

                if(greatestBlobSize<blobSize)
                {
                    greatestBlobSize=blobSize;
                    std::cout<<blobSize<<std::endl;
                    Mat tempDif = lastImage-inputImage;
                    finalImage = tempDif.clone();
                }
                //std::cout<<"Loop"<<std::endl;
            }
        }
    }
    return finalImage;
}

//Takes an image that only has outlines of boxes and gets handles for each textbox.
//Returns a vector of Rects, one bounding rectangle per text box.
std::vector<Rect> boxCorners(Mat processedImage, int connectivity=4)
{
    std::vector<Rect> boxHandles;

    Mat inputImage;
    bool outerRegionFlag=true;

    inputImage = processedImage.clone();

    std::cout<<inputImage.rows<<std::endl;
    std::cout<<inputImage.cols<<std::endl;

    for(int i=0;i<inputImage.cols;i++)
    {
        for(int ii=0;ii<inputImage.rows;ii++)
        {
            if(inputImage.at<uchar>(ii,i)==0)
            {
                Rect boundBox;

                //Fill the zero-valued region so it is not visited again; boundBox receives its bounding rectangle
                floodFill(inputImage, cv::Point(i,ii), Scalar(255),&boundBox,Scalar(0),Scalar(50),connectivity);

                if(outerRegionFlag) //The first region found is the outer zone of the page, so it is skipped
                {
                    outerRegionFlag=false;
                }
                else
                {
                    boxHandles.push_back(boundBox);
                }
            }
        }
    }
    return boxHandles;
}

Mat drawTestBoxes(Mat originalImage, std::vector<Rect> boxes)
{
    //Start from a black canvas with the same size and type as the original image
    Mat outImage = Mat::zeros(originalImage.size(), originalImage.type());

    for(size_t i=0;i<boxes.size();i++)
    {
        rectangle(outImage,boxes[i],Scalar(255));
    }
    return outImage;
}

int main() {

    Mat image;
    Mat thresholded;
    Mat processed;

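    image = imread( "Images/W2.png", 1 );
    if(image.empty())
    {
        std::cout<<"Could not read Images/W2.png"<<std::endl;
        return 1;
    }
    Mat channel[3];

    split(image, channel);

    //Inverse binary threshold: dark lines and text become 255, light paper becomes 0
    threshold(channel[0],thresholded,150,255,THRESH_BINARY_INV);

    std::cout<<"Computing biggest object"<<std::endl;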
    processed = biggestComponent(thresholded);

    std::vector<Rect> textBoxes = boxCorners(processed);

    Mat finalBoxes = drawTestBoxes(image,textBoxes);


    namedWindow("Original", WINDOW_AUTOSIZE );
    imshow("Original", channel[0]);

    namedWindow("Thresholded", WINDOW_AUTOSIZE );
    imshow("Thresholded", thresholded);

    namedWindow("Processed", WINDOW_AUTOSIZE );
    imshow("Processed", processed);

    namedWindow("Boxes", WINDOW_AUTOSIZE );
    imshow("Boxes", finalBoxes);



    std::cout<<"waiting for user input"<<std::endl;

    waitKey(0);

    return 0;
}

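【Comments】:

【Solution 2】:

Hey, edge detection is not the only way. Since the edges are thick enough (at least one pixel everywhere), binarization lets you single out the regions inside the boxes.

With a few simple criteria you can get rid of the clutter, and the bounding boxes alone give you a fairly good segmentation.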

【Comments】:

Similar to Sneaky Polar Bear's approach.