Live Text Recognition (OCR): Answers

【Question title】: Live Text Recognition (OCR)
【Posted】: 2015-09-17 02:38:13
【Question】:

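I would like to know whether OCR can be run on the iPhone's live camera feed, without taking a photo. The alphanumeric text follows a predictable, sometimes fixed, pattern (for example, a serial number).

I have already tried OpenCV and Tesseract, but I can't figure out how to do image processing on the live camera feed.

What I don't know is how to do the part where I recognize the text I'm expecting. Is there another library I can use for that part?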

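【Question comments】:

【Solution 1】:

You can do this with TesseractOCR and AVCaptureSession. The code below uses a timer to let one camera frame through every 0.5 seconds and runs recognition on a background queue, so a new scan starts only after the previous one has finished.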

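#import <AVFoundation/AVFoundation.h>
#import <TesseractOCR/TesseractOCR.h>

@interface YourClass () <AVCaptureVideoDataOutputSampleBufferDelegate>
{
    BOOL canScanFrame;  // set by the timer: the next frame may be scanned
    BOOL isScanning;    // YES while Tesseract is working on a frame
}
@property (strong, nonatomic) NSTimer *timer;
@property (strong, nonatomic) G8Tesseract *scanner; // the OCR engine used as self.scanner

@end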

@implementation YourClass
//...
- (void)prepareToScan
{
    //Prepare capture session, preview layer and so on
    //...

    self.timer = [NSTimer scheduledTimerWithTimeInterval:0.5 target:self selector:@selector(timerTicked) userInfo:nil repeats:YES];
}

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    if (canScanFrame) {
        canScanFrame = NO;
        isScanning = YES;   // mark busy first so the timer cannot allow a second frame mid-setup

        CGImageRef imageRef = [self imageFromSampleBuffer:sampleBuffer];
        UIImage *image = [UIImage imageWithCGImage:imageRef scale:1 orientation:UIImageOrientationRight];
        CGImageRelease(imageRef);

        [self.scanner setImage:image];

        // Recognition is slow, so run it off the capture queue
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            NSLog(@"scan start");
            [self.scanner recognize];
            NSLog(@"scan stop");
            dispatch_async(dispatch_get_main_queue(), ^{
                isScanning = NO;
                NSString *text = [self.scanner recognizedText];
                //do something with text
            });
        });
    }
}

// Create a CGImageRef from sample buffer data. The caller owns the result and must CGImageRelease it.
// Assumes the video output is configured to deliver kCVPixelFormatType_32BGRA frames.
- (CGImageRef)imageFromSampleBuffer:(CMSampleBufferRef)sampleBuffer
{
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(imageBuffer, 0);   // Lock the image buffer while its pixels are read

    // BGRA buffers are non-planar, so use the buffer's base address rather than a plane's
    uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    // Little-endian 32-bit with alpha first describes BGRA byte order in memory
    CGContextRef newContext = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    CGImageRef newImage = CGBitmapContextCreateImage(newContext);   // copies the pixels
    CGContextRelease(newContext);

    CGColorSpaceRelease(colorSpace);
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);

    return newImage;
}

- (void)timerTicked
{
    if (!isScanning) {
        canScanFrame = YES;
    }
}

@end
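
The "do something with text" step is left open in the answer. Since the question says the expected text follows a predictable pattern, one option is to search the OCR output for a substring of the expected shape and ignore everything else. The following is only a sketch in plain C (which also compiles inside an Objective-C file). The function names and the tiny pattern language ('A' for an uppercase letter, '9' for a digit) are made up for this example and are not part of Tesseract.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Pattern language (an assumption made for this sketch):
 *   'A' matches one uppercase ASCII letter,
 *   '9' matches one ASCII digit,
 *   any other character must match literally. */
static bool matches_at(const char *text, const char *pattern)
{
    for (; *pattern != '\0'; ++text, ++pattern) {
        unsigned char c = (unsigned char)*text;
        if (c == '\0') return false;                 /* text ended too early */
        if (*pattern == 'A') {
            if (!isupper(c)) return false;
        } else if (*pattern == '9') {
            if (!isdigit(c)) return false;
        } else if (*text != *pattern) {
            return false;
        }
    }
    return true;
}

/* Returns a pointer to the first substring of `text` that fits `pattern`,
 * or NULL if the OCR output contains no candidate. */
const char *find_serial(const char *text, const char *pattern)
{
    if (text == NULL || pattern == NULL || *pattern == '\0') return NULL;
    for (const char *p = text; *p != '\0'; ++p) {
        if (matches_at(p, pattern)) return p;
    }
    return NULL;
}
```

From the main-queue block this could be called as find_serial(text.UTF8String, "AA-9999"), where "AA-9999" stands in for whatever format your serial numbers actually have. Frames whose text contains no match can simply be dropped, which filters out most misreads.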

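【Comments】:

Thanks for your answer! It works. Do you have any suggestions for minimizing CPU usage and improving accuracy? Would you recommend cropping the image to a specific rectangle before sending it to Tesseract, or using tesseract.rect?
Yes, that will help. You can also try converting the image to grayscale before sending it to Tesseract, which can improve recognition accuracy.
@arturdev Where is self.scanner defined?
@iajnr You can define it right below the timer.
@arturdev self.scanner = [[G8Tesseract alloc] initWithLanguage:@"eng+ita"]; Is that OK?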
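
The two suggestions from the comments, cropping to a region of interest and converting to grayscale, can be combined in a single pass over the raw pixel data. The following is a rough sketch in plain C, not the answer author's code. It assumes the frame is 32-bit BGRA, the same layout imageFromSampleBuffer: relies on. The function name and parameters are invented for this example, and the rectangle is in buffer coordinates, before any UIImage orientation is applied.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies the rectangle (x, y, w, h) out of a 32-bit BGRA frame into `dst`
 * as 8-bit grayscale: one byte per pixel, tightly packed, w * h bytes.
 * Returns 0 on success, -1 if an argument is invalid or the rectangle
 * does not fit inside the frame. */
int crop_bgra_to_gray(const uint8_t *src, size_t src_width, size_t src_height,
                      size_t src_bytes_per_row,
                      size_t x, size_t y, size_t w, size_t h,
                      uint8_t *dst)
{
    if (src == NULL || dst == NULL || w == 0 || h == 0) return -1;
    if (x > src_width || w > src_width - x) return -1;
    if (y > src_height || h > src_height - y) return -1;

    for (size_t row = 0; row < h; ++row) {
        /* Rows may be padded, so step by bytes-per-row, not width * 4 */
        const uint8_t *in = src + (y + row) * src_bytes_per_row + x * 4;
        uint8_t *out = dst + row * w;
        for (size_t col = 0; col < w; ++col, in += 4) {
            unsigned b = in[0], g = in[1], r = in[2];   /* in[3] is alpha, ignored */
            /* Rec. 601 luma (0.299 R + 0.587 G + 0.114 B) in fixed point, scaled by 256 */
            out[col] = (uint8_t)((77 * r + 150 * g + 29 * b + 128) >> 8);
        }
    }
    return 0;
}
```

Inside imageFromSampleBuffer: the inputs would come from the base address, width, height, and bytes-per-row values already read there, while the buffer is still locked. Whether this beats setting tesseract.rect depends on the device and image size, so it is worth measuring both.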