iOS UIImage Binarization for OCR - handling images with varying luminance

【Question title】: iOS UIImage Binarization for OCR - handling images with varying luminance
【Posted】: 2013-09-08 23:08:22
【Question】:

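I have a C++ binarization routine that I use ahead of a later OCR step. However, I found that it introduces unwanted slanting of the text. While looking for alternatives I found GPUImage, which proved very useful and solved the slanting issue.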

I am using GPUImage code like this to binarize my input images before applying OCR.

However, a single threshold value does not cover the range of images I get. See two samples from my input images:

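I can't handle both with the same threshold value. A low value seems to work for the second image, and a higher value works for the first one.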

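The second image seems to be especially difficult: whatever value I set for the threshold, I never get all the characters binarized correctly. My C++ binarization routine, on the other hand, seems to get it right, but I don't have enough insight into it to experiment with it the way I can with the simple threshold value in GPUImage.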

How should I handle this?

UPDATE:

I tried GPUImageAverageLuminanceThresholdFilter with the default multiplier = 1. It works fine with the first image, but the second image is still a problem.
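For readers who have not met this kind of filter, here is a minimal sketch of the idea behind an average-luminance threshold, written in plain C++ rather than taken from GPUImage's shader code. It assumes an 8-bit grayscale buffer; the function name `averageLuminanceThreshold` and the `multiplier` parameter are hypothetical and only mirror the "multiplier" mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not a GPUImage API): binarize 8-bit grayscale pixels
// against a single global threshold = multiplier * mean luminance.
std::vector<std::uint8_t> averageLuminanceThreshold(
    const std::vector<std::uint8_t>& gray, double multiplier = 1.0) {
    if (gray.empty()) return {};
    const double sum = std::accumulate(gray.begin(), gray.end(), 0.0);
    const double threshold = multiplier * sum / static_cast<double>(gray.size());
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        // Brighter than the scaled average becomes white, everything else black.
        out[i] = gray[i] > threshold ? 255 : 0;
    }
    return out;
}
```

Because the threshold is still one number for the whole image, an image whose lighting changes from one region to another can defeat it, which would be consistent with the second sample remaining a problem.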

Some more diverse inputs for binarization:

UPDATE II:

After going through this answer by Brad, I tried GPUImageAdaptiveThresholdFilter (this time also using GPUImagePicture, because earlier I was only applying the filter to a UIImage).

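With this, the second image is binarized perfectly. However, with the blur size set to 3.0, the first one has a lot of noise after binarization, and OCR then adds extra characters. With a lower blur size, the second image loses precision.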

Here it is:

+(UIImage *)binarize : (UIImage *) sourceImage
{
    UIImage * grayScaledImg = [self toGrayscale:sourceImage];
    GPUImagePicture *imageSource = [[GPUImagePicture alloc] initWithImage:grayScaledImg];
    GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
    stillImageFilter.blurSize = 3.0;    

    [imageSource addTarget:stillImageFilter];   
    [imageSource processImage];        

    UIImage *imageWithAppliedThreshold = [stillImageFilter imageFromCurrentlyProcessedOutput];
  //  UIImage *destImage = [thresholdFilter imageByFilteringImage:grayScaledImg];
    return imageWithAppliedThreshold;
}
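To make the role of the blur size more concrete, here is a minimal sketch of local-mean adaptive thresholding in plain C++. It is not GPUImage's implementation: it assumes a row-major 8-bit grayscale buffer, uses a simple box average as the "blur", and the names `adaptiveMeanThreshold`, `radius` and `offset` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: compare every pixel with the mean of its
// (2 * radius + 1)^2 neighbourhood, clipped at the image borders.
// A pixel becomes white when it is brighter than (local mean - offset).
std::vector<std::uint8_t> adaptiveMeanThreshold(
    const std::vector<std::uint8_t>& gray, int width, int height,
    int radius, double offset) {
    std::vector<std::uint8_t> out(gray.size(), 0);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const int x0 = std::max(0, x - radius);
            const int x1 = std::min(width - 1, x + radius);
            const int y0 = std::max(0, y - radius);
            const int y1 = std::min(height - 1, y + radius);
            long sum = 0;
            for (int yy = y0; yy <= y1; ++yy) {
                for (int xx = x0; xx <= x1; ++xx) {
                    sum += gray[static_cast<std::size_t>(yy) * width + xx];
                }
            }
            const double count = static_cast<double>((x1 - x0 + 1) * (y1 - y0 + 1));
            const double mean = static_cast<double>(sum) / count;
            const std::size_t idx = static_cast<std::size_t>(y) * width + x;
            out[idx] = gray[idx] > mean - offset ? 255 : 0;
        }
    }
    return out;
}
```

In this sketch a flat background pixel is almost equal to its local mean, so without a positive `offset` small sensor noise can flip it to black. A larger `radius` averages over more of the page and behaves more like a global threshold.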

【Comments】:

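What does your C++ binarization routine look like? Perhaps that could be adapted into a custom filter within the framework. Is it a locally adaptive binarization or a global thresholding?
What the C++ routine does is grayscaling + binarization. As for GPUImage, I do the grayscaling myself and then pass the output to the GPUImage filter. I use one of the many grayscaling techniques mentioned on Stack Overflow. Would you like me to mention it here? Basically I have tried 3 different routines for it, but the results don't differ much, so I think it is not relevant.
It is not really my C++ routine. It was provided by someone else, and I can neither share it here in full nor summarize how it works, because I don't have much insight into it. It is fairly complex. Everything I have described to you is what I derived from the comments inside it.
I have commercial but cheap binarization code for iOS. Can you give me a sample of a hard image you want binarized, so that I can try it?
@BradLarson Could you please look at the final update and suggest how I can best make use of GPUImage?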

Tags: ios image-processing ocr gpuimage

【Answer 1】:

For the preprocessing step you need adaptive thresholding here.

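I got these results using OpenCV's grayscale conversion and adaptive thresholding methods. With the addition of low-pass noise filtering (Gaussian or median) it should probably work like a charm.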
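As an illustration of the median option, here is a minimal 3x3 median filter in plain C++. It is a sketch and not OpenCV code: it assumes a row-major 8-bit grayscale buffer, replicates edge pixels at the borders, and the name `medianFilter3x3` is hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: 3x3 median filter on a row-major 8-bit grayscale image.
// Coordinates outside the image are clamped to the nearest valid pixel.
std::vector<std::uint8_t> medianFilter3x3(
    const std::vector<std::uint8_t>& gray, int width, int height) {
    std::vector<std::uint8_t> out(gray.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::array<std::uint8_t, 9> window{};
            std::size_t n = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = std::min(std::max(y + dy, 0), height - 1);
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    window[n++] = gray[static_cast<std::size_t>(yy) * width + xx];
                }
            }
            // The median of 9 values is the 5th smallest (index 4).
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            out[static_cast<std::size_t>(y) * width + x] = window[4];
        }
    }
    return out;
}
```

A median filter removes isolated speckles while keeping stroke edges sharper than a Gaussian blur of similar size would, which is why it is a common choice before or after thresholding text.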

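I used provisia (a UI that helps you process images quickly) to find the block size I need: 43 for the image you supplied here. The block size may change if you take the photo from closer or farther away. If you want a generic algorithm, you need to develop one that searches for the best size (search until numbers are detected).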
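The search suggested above could be sketched roughly as follows. This is only one possible shape for it: `findBlockSize` and the `digitsDetected` callback are hypothetical names, and the callback stands in for "threshold with this block size, run OCR, and report whether numbers were found". Only odd sizes of at least 3 are tried, since OpenCV's adaptive threshold requires an odd block size greater than 1.

```cpp
#include <functional>

// Hypothetical helper: try odd block sizes in ascending order and return the
// first one for which the caller-supplied check succeeds, or -1 if none does.
int findBlockSize(int minSize, int maxSize,
                  const std::function<bool(int)>& digitsDetected) {
    int start = minSize < 3 ? 3 : minSize;
    if (start % 2 == 0) ++start;  // block sizes must be odd
    for (int size = start; size <= maxSize; size += 2) {
        if (digitsDetected(size)) return size;
    }
    return -1;
}
```

Running OCR once per candidate size is expensive, so in practice you would probably restrict the range or use a coarser step.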

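EDIT: I just saw the last image. It is too small to be processed. Even if you apply the best preprocessing algorithm, you are not going to detect those numbers. Sampling would not be a solution, since it brings noise with it.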

【Comments】:

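+1 for the great results. Yes, that is the quality I need. However, I am new to OpenCV, so it would be a big help if you could point me to a native iOS code sample.
Well, I already gave the link, but here it is: docs.opencv.org/doc/tutorials/ios/table_of_content_ios/…. First try to link OpenCV with your project and run the hello world example. After that, check out the image processing example. All you need to do is the UIImage <-> cv::Mat conversion, and the code for that is actually already there. Then use the "cvtColor" and "adaptiveThreshold" methods. Those are documented there as well: docs.opencv.org/modules/imgproc/doc/…
cse.iitk.ac.in/users/vision/dipakmj/papers/… You can also check page 139 (or page 155 of the PDF).
What is missing is a link to project code. My current situation does not allow me to work through it from scratch. If I have to, I would rather spend most of my time experimenting with values to make it accurate than setting everything up from scratch. That is one of the reasons I chose GPUImage over Core Image.
stackoverflow.com/questions/10688672/… Another link that may be useful.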

【Answer 2】:

I finally ended up exploring this on my own, and here is my result with the GPUImage filter:

+ (UIImage *) doBinarize:(UIImage *)sourceImage
{
    //first off, try to grayscale the image using iOS core Image routine
    UIImage * grayScaledImg = [self grayImage:sourceImage];
    GPUImagePicture *imageSource = [[GPUImagePicture alloc] initWithImage:grayScaledImg];
    GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
    stillImageFilter.blurSize = 8.0;

    [imageSource addTarget:stillImageFilter];
    [imageSource processImage];

    UIImage *retImage = [stillImageFilter imageFromCurrentlyProcessedOutput];
    return retImage;
}

+ (UIImage *) grayImage :(UIImage *)inputImage
{    
    // Create a graphic context.
    UIGraphicsBeginImageContextWithOptions(inputImage.size, NO, 1.0);
    CGRect imageRect = CGRectMake(0, 0, inputImage.size.width, inputImage.size.height);

    // Draw the image with the luminosity blend mode.
    // On top of a white background, this will give a black and white image.
    [inputImage drawInRect:imageRect blendMode:kCGBlendModeLuminosity alpha:1.0];

    // Get the resulting image.
    UIImage *outputImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();

    return outputImage;
}
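For readers working outside UIKit, the grayscale step can be approximated with a weighted sum of the colour channels. The sketch below uses the common Rec. 601 luma weights; that is an assumption for illustration, and the values Core Graphics produces for the luminosity blend mode may differ. The name `rgbToGray` and the interleaved RGB layout are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert interleaved 8-bit RGB (R,G,B,R,G,B,...) to
// 8-bit grayscale using Rec. 601 weights (assumed, not Core Graphics' own).
std::vector<std::uint8_t> rgbToGray(const std::vector<std::uint8_t>& rgb) {
    std::vector<std::uint8_t> gray(rgb.size() / 3);
    for (std::size_t i = 0; i < gray.size(); ++i) {
        const double luma = 0.299 * rgb[3 * i] +
                            0.587 * rgb[3 * i + 1] +
                            0.114 * rgb[3 * i + 2];
        gray[i] = static_cast<std::uint8_t>(std::lround(luma));
    }
    return gray;
}
```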

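I achieve almost 90% with this. I am sure there must be better options, but I experimented with blurSize as far as I could, and 8.0 is the value that works with most of my input images.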

For anyone else trying this, good luck with your efforts!

【Comments】:

【Answer 3】:

Swift 3

Solution 1

extension UIImage {

func doBinarize() -> UIImage? {

    let grayScaledImg = self.grayImage()
    let imageSource = GPUImagePicture(image: grayScaledImg)
    let stillImageFilter = GPUImageAdaptiveThresholdFilter()
    stillImageFilter.blurRadiusInPixels = 8.0 

    imageSource!.addTarget(stillImageFilter)
    stillImageFilter.useNextFrameForImageCapture()
    imageSource!.processImage()


    guard let retImage: UIImage = stillImageFilter.imageFromCurrentFramebuffer(with: UIImageOrientation.up) else {
        print("unable to obtain UIImage from filter")
        return nil
    }

    return retImage
}

func grayImage() -> UIImage? {
    UIGraphicsBeginImageContextWithOptions(self.size, false, 1.0)
    let imageRect = CGRect(x: 0, y: 0, width: self.size.width, height: self.size.height)

    self.draw(in: imageRect, blendMode: .luminosity, alpha:  1.0)

    let outputImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()

    return outputImage
}


}

The result is

Solution 2

Use GPUImageLuminanceThresholdFilter to get a 100% black-and-white result with no gray

   let stillImageFilter = GPUImageLuminanceThresholdFilter() 
   stillImageFilter.threshold = 0.9

For example, I needed to detect a flashing light, and this worked for me

【Comments】: