Segmenting characters from an image: answers

【Question title】: Segmenting characters from Image
【Posted】: 2014-12-03 22:26:35
【Question】:

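I'm having trouble segmenting the license plate images below. After thresholding, single characters break up into more than one piece, so I get wrong OCR results. I applied a morphological closing operation after thresholding, but even then I can't segment the characters properly.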

The code I use to segment the images above is as follows:

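#include <iostream>
#include <cv.h>
#include <highgui.h>

using namespace std;
using namespace cv;

int main(int argc, char *argv[])
{
  if (argc < 2)
  {
    cerr << "Usage: " << argv[0] << " <image>" << endl;
    return -1;
  }

  // Load the plate as a single-channel grayscale image (flag 0)
  IplImage *img1 = cvLoadImage(argv[1], 0);
  if (!img1)
  {
    cerr << "Failed to open image: " << argv[1] << endl;
    return -1;
  }

  cvNamedWindow("Orig");
  cvShowImage("Orig", img1);
  cvWaitKey(0);

  // The adaptive threshold window must be odd; here it spans the image height
  int wind = img1->height;
  if (wind % 2 == 0) wind += 1;

  cvAdaptiveThreshold(img1, img1, 255, CV_ADAPTIVE_THRESH_GAUSSIAN_C,
                      CV_THRESH_BINARY_INV, wind);

  // Scratch buffer required by cvMorphologyEx for in-place operation
  IplImage *temp = cvCloneImage(img1);

  cvNamedWindow("Thre");
  cvShowImage("Thre", img1);
  cvWaitKey(0);

  // 3x3 rectangular structuring element anchored at its center
  IplConvKernel *kernel = cvCreateStructuringElementEx(3, 3, 1, 1,
                                                       CV_SHAPE_RECT, NULL);

  // Morphological closing (dilate, then erode), one iteration
  cvMorphologyEx(img1, img1, temp, kernel, CV_MOP_CLOSE, 1);

  cvNamedWindow("close");
  cvShowImage("close", img1);
  cvWaitKey(0);

  // Release everything that was allocated above
  cvReleaseStructuringElement(&kernel);
  cvReleaseImage(&temp);
  cvReleaseImage(&img1);
  return 0;
}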

The output images are given below.

Can anyone suggest a good method to segment the characters from these images?

【Comments】:

I'm not sure whether this will help: stackoverflow.com/a/10970473/2380071, or this one: stackoverflow.com/a/14372743/2380071
Dilation and erosion didn't work for me.

Tags: image opencv image-processing computer-vision

【Solution 1】:

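I would like to show a quick and dirty approach to isolate the letters/numbers in the plates, since the actual segmentation of the characters is not the problem. When these are the input images:

This is what you get at the end of my algorithm:

So what I discuss in this answer will give you some ideas and help you get rid of the artifacts present at the end of your current segmentation process. Keep in mind that this approach should only work with these types of images. If you need something more robust, you will have to adjust some things or come up with entirely new ways of doing them.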

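Given the drastic changes in brightness, it's best to run histogram equalization to improve the contrast and make the images more similar to each other, so that all the other techniques and parameters work for every one of them: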
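
To make the idea behind histogram equalization a bit more concrete, here is a minimal sketch in plain C++ (no OpenCV), assuming an 8-bit grayscale image stored as a flat vector. The name `equalize_sketch` is hypothetical. It follows the textbook CDF remapping and is not claimed to reproduce cv::equalizeHist bit for bit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OpenCV function): equalizes an 8-bit
// grayscale image given as a flat vector of pixel values.
std::vector<std::uint8_t> equalize_sketch(const std::vector<std::uint8_t>& pixels)
{
    if (pixels.empty())
        return {};

    // 1. Histogram of the 256 possible intensity levels.
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : pixels)
        ++hist[p];

    // 2. Cumulative distribution function (running sum of the histogram).
    std::array<std::size_t, 256> cdf{};
    std::size_t running = 0;
    for (int v = 0; v < 256; ++v)
    {
        running += hist[v];
        cdf[v] = running;
    }

    // 3. CDF value of the darkest level present, so that it maps to 0.
    std::size_t cdf_min = 0;
    for (int v = 0; v < 256; ++v)
    {
        if (hist[v] != 0)
        {
            cdf_min = cdf[v];
            break;
        }
    }

    const std::size_t total = pixels.size();
    if (total == cdf_min)   // only one intensity level: nothing to stretch
        return pixels;

    // 4. Remap every pixel through the normalized CDF.
    std::vector<std::uint8_t> out(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i)
    {
        double scaled = 255.0 * (cdf[pixels[i]] - cdf_min) / (total - cdf_min);
        out[i] = static_cast<std::uint8_t>(scaled + 0.5);
    }
    return out;
}
```

For instance, a low-contrast input with the four levels 100, 110, 120, 130 is stretched to 0, 85, 170, 255, which is the kind of contrast boost that helps the later thresholding step.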

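Next, a bilateral filter can be used to smooth the images while preserving the edges of the objects, which is important for the binarization process. This filter requires a little more processing power than others.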
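
Why this filter keeps edges can be seen in a one-dimensional sketch. The function below is hypothetical and assumes a plain row of samples instead of a 2-D image. Each neighbor is weighted both by its distance and by how different its value is, so samples on the other side of a strong edge contribute almost nothing. cv::bilateralFilter works on 2-D images and its internals may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 1-D bilateral filter, for illustration only.
// sigma_space controls the spatial falloff, sigma_range the value falloff.
std::vector<double> bilateral_1d_sketch(const std::vector<double>& signal,
                                        int radius,
                                        double sigma_space,
                                        double sigma_range)
{
    const int n = static_cast<int>(signal.size());
    std::vector<double> out(signal.size());

    for (int i = 0; i < n; ++i)
    {
        double weighted_sum = 0.0;
        double weight_total = 0.0;

        for (int k = -radius; k <= radius; ++k)
        {
            const int j = i + k;
            if (j < 0 || j >= n)
                continue;   // neighbors outside the signal are skipped

            const double ds = static_cast<double>(k);   // spatial distance
            const double dr = signal[j] - signal[i];    // value difference
            const double w = std::exp(-(ds * ds) / (2.0 * sigma_space * sigma_space)
                                      - (dr * dr) / (2.0 * sigma_range * sigma_range));
            weighted_sum += w * signal[j];
            weight_total += w;
        }

        // weight_total is at least 1 because the k == 0 term has weight 1
        out[i] = weighted_sum / weight_total;
    }
    return out;
}
```

On a noisy step such as 10, 12, 10, 12, 200, 202, 200, 202 the small ripples are averaged out on each side, while the jump between the fourth and fifth sample stays sharp.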

Once the images are ready to be binarized, an adaptive threshold does the trick:
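
As a rough illustration of what "adaptive" means here, the sketch below compares every pixel against the mean of its own neighborhood minus a constant, instead of against one global value. Note that this is the simpler mean-based variant, whereas the code in this answer uses the Gaussian-weighted one. The function name is hypothetical, and pixels outside the image are simply skipped, which is an assumption and may differ from how OpenCV treats borders.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: mean-based adaptive threshold on a row-major
// 8-bit image. A pixel becomes 255 when it is brighter than the mean of
// its block_size x block_size neighborhood minus c, and 0 otherwise.
std::vector<std::uint8_t> adaptive_mean_threshold_sketch(
    const std::vector<std::uint8_t>& img,
    int width, int height, int block_size, double c)
{
    std::vector<std::uint8_t> out(img.size(), 0);
    const int r = block_size / 2;

    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            double sum = 0.0;
            int count = 0;

            // Accumulate the neighborhood, ignoring out-of-image positions
            for (int dy = -r; dy <= r; ++dy)
            {
                for (int dx = -r; dx <= r; ++dx)
                {
                    const int yy = y + dy;
                    const int xx = x + dx;
                    if (yy < 0 || yy >= height || xx < 0 || xx >= width)
                        continue;
                    sum += img[yy * width + xx];
                    ++count;
                }
            }

            // count >= 1 because the center pixel is always inside the image
            const double local_threshold = sum / count - c;
            out[y * width + x] = (img[y * width + x] > local_threshold) ? 255 : 0;
        }
    }
    return out;
}
```

With a single row such as 50, 50, 10, 50, 50, 200, 200, 160, 200, a block size of 3 and c = 2, both the value 10 in the dark half and the value 160 in the bright half come out as 0, even though 160 is brighter than everything in the dark half. A single global threshold could not separate both.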

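The result of the binarization is similar to what you achieved, so I came up with a way to use findContours() to remove the smaller and the larger segments: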
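
The filtering itself is just an area test per contour. Below is a minimal sketch without OpenCV, assuming each contour is a closed polygon. `Pt`, `Contour` and both function names are hypothetical stand-ins, and the area is computed with the shoelace formula, in the spirit of what cv::contourArea returns for a simple polygon.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { int x; int y; };        // hypothetical stand-in for cv::Point
using Contour = std::vector<Pt>;    // one closed polygon

// Area of a closed polygon via the shoelace formula.
double polygon_area_sketch(const Contour& c)
{
    double twice_area = 0.0;
    for (std::size_t i = 0; i < c.size(); ++i)
    {
        const Pt& a = c[i];
        const Pt& b = c[(i + 1) % c.size()];   // wraps around to close the polygon
        twice_area += static_cast<double>(a.x) * b.y
                    - static_cast<double>(b.x) * a.y;
    }
    return std::fabs(twice_area) / 2.0;
}

// Keep only the contours whose area lies strictly between the two limits.
std::vector<Contour> filter_by_area_sketch(const std::vector<Contour>& contours,
                                           double min_area, double max_area)
{
    std::vector<Contour> kept;
    for (const Contour& c : contours)
    {
        const double area = polygon_area_sketch(c);
        if (area > min_area && area < max_area)
            kept.push_back(c);
    }
    return kept;
}
```

With limits of 50 and 2000, as used in the answer's code, a 5x5 speck (area 25) and a 100x50 blob (area 5000) are dropped, while a 10x20 character-sized segment (area 200) survives. The limits are tuned to these particular images.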

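The result looks a little better, but it destroyed important parts of the characters on the plate. However, that's not really a problem right now, because we are not worried about recognizing the characters: we just want to isolate the area where they are. So the next step is to keep erasing segments, more specifically the ones that are not aligned along the same Y axis as the digits. The contours that survived this cut are: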
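
The alignment test can be sketched on its own, assuming every contour has already been reduced to its top and bottom Y coordinates plus its area. The `Segment` struct and the function name are hypothetical. The logic mirrors the description above: average the top and bottom of the large segments, then keep only what fits inside that band plus a small tolerance.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-contour summary: vertical extent and area.
struct Segment
{
    int min_y;     // topmost row touched by the contour
    int max_y;     // bottommost row touched by the contour
    double area;
};

// Keep the segments that lie inside the vertical band defined by the
// average top/bottom of the "large" segments (area > large_area).
std::vector<Segment> keep_aligned_sketch(const std::vector<Segment>& segments,
                                         double large_area, int tolerance)
{
    long sum_top = 0, sum_bottom = 0;
    int count = 0;
    for (const Segment& s : segments)
    {
        if (s.area > large_area)
        {
            sum_top += s.min_y;
            sum_bottom += s.max_y;
            ++count;
        }
    }

    // Without any large segment there is no band to compare against
    if (count == 0)
        return {};

    const int avg_top = static_cast<int>(sum_top / count);
    const int avg_bottom = static_cast<int>(sum_bottom / count);

    std::vector<Segment> kept;
    for (const Segment& s : segments)
    {
        if (s.min_y >= avg_top - tolerance && s.max_y <= avg_bottom + tolerance)
            kept.push_back(s);
    }
    return kept;
}
```

Suppose three character-sized segments span roughly rows 20 to 60. A small fragment at rows 40 to 45 is then kept because it sits inside the band, while a bolt at rows 2 to 10 and a screw at rows 70 to 78 are discarded. The heuristic breaks down as soon as a large non-character segment, such as the plate border, enters the average.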

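This is much better. At this point a new std::vector<cv::Point> is created to store all the pixel coordinates needed to draw all these segments. This is necessary to create a cv::RotatedRect, which is what allows us to create a bounding box and crop the image: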
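
The crop step boils down to "find the box around all surviving points, make sure it stays inside the image, copy those pixels". Here is a small sketch of that idea without OpenCV. It deliberately swaps in an upright (axis-aligned) bounding box for the rotated rectangle, and `Pt`, `Box` and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Pt { int x; int y; };                          // stand-in for cv::Point
struct Box { int x; int y; int width; int height; };  // stand-in for cv::Rect

// Upright bounding box of a point set, clipped to the image area.
Box bounding_box_sketch(const std::vector<Pt>& points, int img_width, int img_height)
{
    if (points.empty())
        return {0, 0, 0, 0};

    int min_x = points[0].x, max_x = points[0].x;
    int min_y = points[0].y, max_y = points[0].y;
    for (const Pt& p : points)
    {
        min_x = std::min(min_x, p.x);
        max_x = std::max(max_x, p.x);
        min_y = std::min(min_y, p.y);
        max_y = std::max(max_y, p.y);
    }

    // Clip to the image so the crop never reads out of bounds
    min_x = std::max(min_x, 0);
    min_y = std::max(min_y, 0);
    max_x = std::min(max_x, img_width - 1);
    max_y = std::min(max_y, img_height - 1);
    if (max_x < min_x || max_y < min_y)
        return {0, 0, 0, 0};

    return {min_x, min_y, max_x - min_x + 1, max_y - min_y + 1};
}

// Copy the pixels inside roi out of a row-major 8-bit image.
std::vector<std::uint8_t> crop_sketch(const std::vector<std::uint8_t>& img,
                                      int img_width, const Box& roi)
{
    std::vector<std::uint8_t> out;
    out.reserve(static_cast<std::size_t>(roi.width) * roi.height);
    for (int y = roi.y; y < roi.y + roi.height; ++y)
        for (int x = roi.x; x < roi.x + roi.width; ++x)
            out.push_back(img[y * img_width + x]);
    return out;
}
```

Clipping matters in practice: a region of interest that pokes even one pixel outside the image makes an OpenCV crop such as `equalized(roi)` fail at run time.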

From here on you can take the cropped images and apply your own techniques to easily segment the characters of the plate.

Here is the C++ code:

#include <iostream>
#include <string>
#include <vector>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>

/* The code has an outer loop where every iteration processes one of the four input images */

int main()
{
    std::string files[] = { "plate1.jpg", "plate2.jpg", "plate3.jpg", "plate4.jpg" };
    cv::Mat imgs[4];
    for (int a = 0; a < 4; a++)
    {
        /* Load input image */

        imgs[a] = cv::imread(files[a]);
        if (imgs[a].empty())
        {
            std::cout << "!!! Failed to open image: " << files[a] << std::endl;
            return -1;
        }

        /* Convert to grayscale */

        cv::Mat gray;
        cv::cvtColor(imgs[a], gray, cv::COLOR_BGR2GRAY);

        /* Histogram equalization improves the contrast between dark/bright areas */

        cv::Mat equalized;
        cv::equalizeHist(gray, equalized);
        cv::imwrite(std::string("eq_" + std::to_string(a) + ".jpg"), equalized);
        cv::imshow("Hist. Eq.", equalized);

        /* Bilateral filter helps to improve the segmentation process */

        cv::Mat blur;
        cv::bilateralFilter(equalized, blur, 9, 75, 75);
        cv::imwrite(std::string("filter_" + std::to_string(a) + ".jpg"), blur);
        cv::imshow("Filter", blur);

        /* Threshold to binarize the image */

        cv::Mat thres;
        cv::adaptiveThreshold(blur, thres, 255, cv::ADAPTIVE_THRESH_GAUSSIAN_C, cv::THRESH_BINARY, 15, 2); // block size 15, C = 2
        cv::imwrite(std::string("thres_" + std::to_string(a) + ".jpg"), thres);
        cv::imshow("Threshold", thres);

        /* Remove small segments and the extremely large ones as well */

        std::vector<std::vector<cv::Point> > contours;
        cv::findContours(thres, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);

        double min_area = 50;
        double max_area = 2000;
        std::vector<std::vector<cv::Point> > good_contours;
        for (size_t i = 0; i < contours.size(); i++)
        {
            double area = cv::contourArea(contours[i]);
            if (area > min_area && area < max_area)
                good_contours.push_back(contours[i]);
        }

        cv::Mat segments(gray.size(), CV_8U, cv::Scalar(255));
        cv::drawContours(segments, good_contours, -1, cv::Scalar(0), cv::FILLED, 4);
        cv::imwrite(std::string("segments_" + std::to_string(a) + ".jpg"), segments);
        cv::imshow("Segments", segments);

        /* Examine the segments that survived the previous lame filtering process
         * to figure out the top and bottom heights of the largest segments.
         * This info will be used to remove segments that are not aligned with
         * the letters/numbers of the plate.
         * This technique is super flawed for other types of input images.
         */

        // Figure out the average of the top/bottom heights of the largest segments
        int min_average_y = 0, max_average_y = 0, count = 0;
        for (size_t i = 0; i < good_contours.size(); i++)
        {
            const std::vector<cv::Point>& c = good_contours[i];
            double area = cv::contourArea(c);
            if (area > 200)
            {
                int min_y = segments.rows, max_y = 0;
                for (size_t j = 0; j < c.size(); j++)
                {
                    if (c[j].y < min_y)
                        min_y = c[j].y;

                    if (c[j].y > max_y)
                        max_y = c[j].y;
                }
                min_average_y += min_y;
                max_average_y += max_y;
                count++;
            }
        }

        // Without large segments there is nothing to average (avoids a division by zero)
        if (count == 0)
        {
            std::cout << "!!! No large segments found in: " << files[a] << std::endl;
            continue;
        }
        min_average_y /= count;
        max_average_y /= count;
        //std::cout << "Average min: " << min_average_y << " max: " << max_average_y << std::endl;

        // Create a new vector of contours with just the ones that fall within the min/max Y
        std::vector<std::vector<cv::Point> > final_contours;
        for (size_t i = 0; i < good_contours.size(); i++)
        {
            const std::vector<cv::Point>& c = good_contours[i];
            int min_y = segments.rows, max_y = 0;
            for (size_t j = 0; j < c.size(); j++)
            {
                if (c[j].y < min_y)
                    min_y = c[j].y;

                if (c[j].y > max_y)
                    max_y = c[j].y;
            }

            // 5 is to add a little tolerance from the average Y coordinate
            if (min_y >= (min_average_y - 5) && max_y <= (max_average_y + 5))
                final_contours.push_back(c);
        }

        cv::Mat final(gray.size(), CV_8U, cv::Scalar(255));
        cv::drawContours(final, final_contours, -1, cv::Scalar(0), cv::FILLED, 4);
        cv::imwrite(std::string("final_" + std::to_string(a) + ".jpg"), final);
        cv::imshow("Final", final);

        // Create a single vector with all the points that make the segments
        std::vector<cv::Point> points;
        for (size_t x = 0; x < final_contours.size(); x++)
        {
            const std::vector<cv::Point>& c = final_contours[x];
            for (size_t y = 0; y < c.size(); y++)
                points.push_back(c[y]);
        }

        // minAreaRect() needs at least one point
        if (points.empty())
        {
            std::cout << "!!! No aligned segments found in: " << files[a] << std::endl;
            continue;
        }

        // Compute a single bounding box for the points
        cv::RotatedRect box = cv::minAreaRect(cv::Mat(points));

        // Upright rectangle around the rotated box, clipped to the image so
        // that the crop below can never reach outside of it
        cv::Rect roi = box.boundingRect() & cv::Rect(0, 0, equalized.cols, equalized.rows);

        // Draw the box on the original input image
        cv::Point2f vertices[4];
        box.points(vertices);
        for (int i = 0; i < 4; ++i)
            cv::line(imgs[a], vertices[i], vertices[(i + 1) % 4], cv::Scalar(255, 0, 0), 1, cv::LINE_AA);
        cv::imwrite(std::string("box_" + std::to_string(a) + ".jpg"), imgs[a]);
        cv::imshow("Box", imgs[a]);

        // Crop the equalized image with the area defined by the ROI
        cv::Mat crop = equalized(roi);
        cv::imwrite(std::string("crop_" + std::to_string(a) + ".jpg"), crop);
        cv::imshow("crop", crop);

        /* The cropped image should contain only the plate's letters and numbers.
         * From here on you can use your own techniques to segment the characters properly.
         */

        cv::waitKey(0);
    }

    return 0;
}

For a more complete and robust approach to license plate recognition with OpenCV, take a look at Mastering OpenCV with Practical Computer Vision Projects, chapter 5. Source code is available on GitHub!

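【Comments】:

How can we segment the characters from the final image? (Dilation and morphological closing do not work well, because on most license plates the characters are very close to each other.)
In the final image you provided, the characters Y, 5, and 2 are broken apart. So with cvFindContours() we will get them as separate segments, which leads to wrong OCR output.
You can follow many different approaches for this, including the one in the chapter mentioned above.
The technique I described isolates the text region. You should not use these intermediate results for text recognition. But you can use some of my ideas to get there, such as bilateral filtering, adaptive thresholding, and so on.
I tried the SWT and Otsu methods before, but they failed to segment the characters from the images above.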