【Question title】: Tesseract: How to export text and bounding boxes?
【Posted】: 2012-07-07 07:42:30
【Question description】:

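I want to convert a document image to XML and export the position at which each word was found on the page. To get the bounding box information, Tesseract's layout analysis can be used: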

 tess.SetImage(...); 
 tess.SetPageSegMode(tesseract::PSM_AUTO_OSD); 
 tesseract::PageIterator* it = tess.AnalyseLayout(); 
 if (it != NULL) {
     do {  // do-while, so the first word is not skipped
         int top, bottom, left, right; 
         it->BoundingBox(tesseract::RIL_WORD, &left, &top, &right, &bottom); 
     } while (it->Next(tesseract::RIL_WORD));
     delete it;  // the caller owns the iterator
 }

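However, at this point I do not know the actual content of each bounding box. Running the following code performs OCR on the current image, so text contains the text of the entire page.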

 tess.Recognize(0); 
 char* raw = tess.GetUTF8Text();  // the caller owns this buffer
 std::string text(raw); 
 delete[] raw; 

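Currently I store all bounding boxes temporarily in a vector. For each box, I cut a sub-image out of the original image and run OCR on that sub-image. Basically this works, but compared with the output of the Tesseract command-line tool, more OCR errors occur.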

Therefore, I would like to know how to iterate over the OCR result word by word and get the corresponding bounding boxes.

【Question discussion】:

    Tags: c++ iteration ocr tesseract bounding-box


    【Solution 1】:
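    tess.Recognize(0);
    
    // page_res_ is an internal member of the Tesseract API object
    PAGE_RES_IT resultIter(page_res_);
    
    for (resultIter.start_page(false); resultIter.block() != NULL; resultIter.forward()) 
    {
        WERD_RES* wordResult = resultIter.word();
        WERD_CHOICE* word = wordResult->best_choice;  // recognized text for this word
    
        // bounding_box() returns by value, so copy it instead of binding a non-const reference
        TBOX box = wordResult->word->bounding_box();
    }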
    

    【Discussion】:

    • You seem to be using an older version of Tesseract. In v3.01, page_res_ is not accessible; instead you have to use tesseract::ResultIterator* it = tess.GetIterator()
    【Solution 2】:
    NSString *retText = @"";
    tesseract::ResultIterator *ri = tess.GetIterator();
    tesseract::PageIteratorLevel level = tesseract::RIL_WORD;
    
    if (ri != 0) {
    do {
      const char *word = ri->GetUTF8Text(level);
      float conf = ri->Confidence(level);
      int x1, y1, x2, y2;
      ri->BoundingBox(level, &x1, &y1, &x2, &y2);
    
      if (word) {
        printf("word: '%s';  \tconf: %.2f; BoundingBox: %d,%d,%d,%d;\n", word,
               conf, x1, y1, x2, y2);
    
        NSString *temp =
            [NSString stringWithCString:word encoding:NSUTF8StringEncoding];
        retText = [NSString stringWithFormat:@"%@ %@", retText, temp];
        retText = [retText stringByReplacingOccurrencesOfString:@"[\\\""
                                                     withString:@""];
        retText = [retText stringByReplacingOccurrencesOfString:@"\n\n"
                                                     withString:@""];
    
        UIBezierPath *path = [UIBezierPath bezierPath];
    
        [path moveToPoint:CGPointMake(x1, y1)];
        [path addLineToPoint:CGPointMake(x2, y1)];
        [path addLineToPoint:CGPointMake(x2, y2)];
        [path addLineToPoint:CGPointMake(x1, y2)];
        [path addLineToPoint:CGPointMake(x1, y1)];
    
        CAShapeLayer *shapeLayer = [CAShapeLayer layer];
        shapeLayer.path = [path CGPath];
        shapeLayer.strokeColor = [[UIColor blueColor] CGColor];
        shapeLayer.lineWidth = 3.0;
        shapeLayer.fillColor = [[UIColor clearColor] CGColor];
    
        [self.scrollView.layer addSublayer:shapeLayer];
    
        delete[] word;
      }
    } while (ri->Next(level));
    delete ri;  // the iterator returned by GetIterator() is owned by the caller
    }
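
The question asks for XML output, which the loop above stops short of. As a rough sketch of that last step: the words and boxes collected in the loop could be serialized with standard C++ only. The `WordBox` struct, the `escapeXml` and `wordsToXml` helpers, and the `<page>`/`<word>` element names are all assumptions made for illustration. None of them are part of Tesseract, and the XML layout is not a standard format.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one recognized word (not a Tesseract type).
struct WordBox {
    std::string text;              // UTF-8 text of the word
    int left, top, right, bottom;  // bounding box in image pixels
};

// Escape the five characters that are special in XML.
std::string escapeXml(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;
        }
    }
    return out;
}

// Serialize the words as one <word> element each, wrapped in <page>.
// The element and attribute names are made up for this sketch.
std::string wordsToXml(const std::vector<WordBox>& words) {
    std::ostringstream xml;
    xml << "<page>\n";
    for (const WordBox& w : words) {
        xml << "  <word left=\"" << w.left
            << "\" top=\"" << w.top
            << "\" right=\"" << w.right
            << "\" bottom=\"" << w.bottom << "\">"
            << escapeXml(w.text) << "</word>\n";
    }
    xml << "</page>\n";
    return xml.str();
}
```

Inside the loop, one would push `WordBox{word, x1, y1, x2, y2}` into a `std::vector<WordBox>` before `delete[] word`, and call `wordsToXml` once after the loop. Escaping matters here because OCR output can easily contain `&` or `<`.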
    

    【Discussion】:
