【Question Title】: Swift iOS - Vision framework text recognition and rectangles
【Posted】: 2022-08-18 16:21:31
【Question】:

I'm trying to draw rectangles over the text regions found with the Vision framework, but they are always slightly off. This is how I'm doing it:

    public func drawOccurrencesOnImage(_ occurrences: [CGRect], _ image: UIImage) -> UIImage? {
        UIGraphicsBeginImageContextWithOptions(image.size, false, 0.0)
        // End the context on every exit path so it is never left open.
        defer { UIGraphicsEndImageContext() }

        image.draw(at: CGPoint.zero)
        let currentContext = UIGraphicsGetCurrentContext()

        currentContext?.addRects(occurrences)
        currentContext?.setStrokeColor(UIColor.red.cgColor)
        currentContext?.setLineWidth(2.0)
        currentContext?.strokePath()

        // Returns nil if the image could not be produced.
        return UIGraphicsGetImageFromCurrentImageContext()
    }

But the returned image always looks almost right, never quite correct:

This is how I create the boxes, exactly as Apple does:

    let boundingRects: [CGRect] = observations.compactMap { observation in

        guard let candidate = observation.topCandidates(1).first else { return .zero }

        let stringRange = candidate.string.startIndex..<candidate.string.endIndex
        let boxObservation = try? candidate.boundingBox(for: stringRange)

        let boundingBox = boxObservation?.boundingBox ?? .zero

        return VNImageRectForNormalizedRect(boundingBox,
                                            Int(UIViewController.chosenImage?.width ?? 0),
                                            Int(UIViewController.chosenImage?.height ?? 0))
    }

(Source: https://developer.apple.com/documentation/vision/recognizing_text_in_images)

Thank you.

Tags: ios swift uiview vision text-recognition


【Solution 1】:

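VNImageRectForNormalizedRect returns a CGRect with the y coordinates flipped. (macOS and iOS use different coordinate systems: Vision's normalized rects have their origin in the lower-left corner, while UIKit draws from the upper-left.)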
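To make the flip concrete, here is a minimal sketch using only Foundation. The two helpers (`scaledOnly` and `scaledAndFlipped`) and the numbers are hypothetical and only for illustration. `scaledOnly` stands in for a plain scale to pixels, which is my understanding of what VNImageRectForNormalizedRect does; it does not call Vision.

```swift
import Foundation

// Hypothetical helper: scales a normalized rect to pixels without flipping,
// so y is still measured from the bottom edge of the image.
func scaledOnly(_ r: CGRect, width: CGFloat, height: CGFloat) -> CGRect {
    CGRect(x: r.minX * width, y: r.minY * height,
           width: r.width * width, height: r.height * height)
}

// Hypothetical helper: same scaling, but y is measured from the top edge,
// which is what UIKit drawing expects.
func scaledAndFlipped(_ r: CGRect, width: CGFloat, height: CGFloat) -> CGRect {
    CGRect(x: r.minX * width, y: (1 - r.minY - r.height) * height,
           width: r.width * width, height: r.height * height)
}

// A line of text near the top of a 1000 x 800 image (made-up numbers).
let normalized = CGRect(x: 0.1, y: 0.8, width: 0.5, height: 0.1)
let unflipped = scaledOnly(normalized, width: 1000, height: 800)
let flipped = scaledAndFlipped(normalized, width: 1000, height: 800)

print(unflipped.origin.y) // about 640: lands near the bottom when drawn with UIKit
print(flipped.origin.y)   // about 80: near the top, where the text actually is
```

Boxes near the vertical middle of the image move the least under this flip, which may be why the result can look almost right.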

Instead, I might suggest a version of boundingBox adapted from Detecting Objects in Still Images:

fileprivate func boundingBox(forRegionOfInterest: CGRect, withinImageBounds bounds: CGRect) -> CGRect {
    let imageWidth = bounds.width
    let imageHeight = bounds.height

    // Begin with input rect.
    var rect = forRegionOfInterest

    // Reposition origin.
    rect.origin.x *= imageWidth
    rect.origin.x += bounds.origin.x
    rect.origin.y = (1 - rect.origin.y - rect.height) * imageHeight + bounds.origin.y

    // Rescale normalized coordinates.
    rect.size.width *= imageWidth
    rect.size.height *= imageHeight

    return rect
}

In my case, that yielded the correct boxes:


For example:

// Assumes `image`, `size`, `bounds`, and `imageView` are defined in the enclosing scope.
let request = VNDetectTextRectanglesRequest { [self] request, error in
    guard let results = request.results, error == nil else { return }

    let rects = results
        .compactMap { $0 as? VNTextObservation }
        .map { boundingBox(forRegionOfInterest: $0.boundingBox, withinImageBounds: CGRect(origin: .zero, size: size)) }

    let format = UIGraphicsImageRendererFormat()
    format.scale = 1
    let finalImage = UIGraphicsImageRenderer(bounds: bounds, format: format).image { _ in
        image.draw(in: bounds)
        UIColor.red.setStroke()
        for rect in rects {
            let path = UIBezierPath(rect: rect)
            path.lineWidth = 5
            path.stroke()
        }
    }
    DispatchQueue.main.async { [self] in
        imageView.image = finalImage
    }
}
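If you would rather keep the VNImageRectForNormalizedRect call from the question, another option might be to flip its output before drawing. This is only a sketch under that assumption: `flippedForUIKit` is a hypothetical helper name, and it expects rects in pixel coordinates with a lower-left origin.

```swift
import Foundation

// Hypothetical helper: converts pixel rects with a lower-left origin
// into rects with a top-left origin for an image of the given height.
func flippedForUIKit(_ rects: [CGRect], imageHeight: CGFloat) -> [CGRect] {
    rects.map { rect in
        CGRect(x: rect.minX,
               y: imageHeight - rect.maxY, // distance from the top edge
               width: rect.width,
               height: rect.height)
    }
}

// Made-up example: one box in an 800-pixel-tall image.
let lowerLeft = [CGRect(x: 100, y: 640, width: 500, height: 80)]
let topLeft = flippedForUIKit(lowerLeft, imageHeight: 800)
print(topLeft[0].origin.y) // 80.0
```

`imageHeight` has to be in the same units as the rects. If the rects are in pixels but the graphics context is sized in points, the boxes will still be off, which is presumably why the renderer above sets `format.scale = 1`.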
