【Question title】: Fixing incorrect boundingBox coordinates?
【Posted】: 2021-10-17 10:25:57
【Question】:

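I am developing an object detection app for Android (Java) using Google ML Kit and CameraX. I am also using a TensorFlow model, which can be found here. My problem is that the coordinates of my boundingBox are slightly off, as shown in the image below. Please ignore the fact that the object is detected as a spatula; my question is only about getting the graphic drawn on screen to frame the object in the image correctly.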

This is the class used to draw the graphic overlay:

DrawGraphic.java:

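import android.content.Context;
import android.graphics.Canvas;
import android.graphics.Color;
import android.graphics.Matrix;
import android.graphics.Paint;
import android.graphics.Rect;
import android.view.View;

import androidx.camera.core.ImageProxy;
import androidx.camera.view.PreviewView;

public class DrawGraphic extends View {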

    Paint borderPaint, textPaint;
    Rect rect;
    String text;

    ImageProxy imageProxy;
    PreviewView previewView;
    // Image-to-view transform, computed once in the constructor
    Matrix mappingMatrix;

    public DrawGraphic(Context context, Rect rect, String text, ImageProxy imageProxy, PreviewView previewView) {
        super(context);
        this.rect = rect;
        this.text = text;
        // Store the arguments; leaving these fields unassigned causes a NullPointerException
        this.imageProxy = imageProxy;
        this.previewView = previewView;
        // Build the matrix now, while the ImageProxy is still open
        // (MainActivity closes it once detection completes)
        this.mappingMatrix = getMappingMatrix(imageProxy, previewView);

        borderPaint = new Paint();
        borderPaint.setColor(Color.WHITE);
        borderPaint.setStrokeWidth(10f);
        borderPaint.setStyle(Paint.Style.STROKE);

        textPaint = new Paint();
        textPaint.setColor(Color.WHITE);
        textPaint.setStrokeWidth(50f);
        textPaint.setTextSize(32f);
        textPaint.setStyle(Paint.Style.FILL);
    }

    @Override
    protected void onDraw(Canvas canvas) {
        super.onDraw(canvas);
        // Apply the mapping once; calling both setMatrix and concat applies it twice
        canvas.concat(mappingMatrix);
        canvas.drawText(text, rect.centerX(), rect.centerY(), textPaint);
        // drawRect expects left, top, right, bottom
        canvas.drawRect(rect.left, rect.top, rect.right, rect.bottom, borderPaint);
    }

    Matrix getMappingMatrix(ImageProxy imageProxy, PreviewView previewView) {
        Rect cropRect = imageProxy.getCropRect();
        int rotationDegrees = imageProxy.getImageInfo().getRotationDegrees();
        Matrix matrix = new Matrix();

        float[] source = {
                cropRect.left,
                cropRect.top,
                cropRect.right,
                cropRect.top,
                cropRect.right,
                cropRect.bottom,
                cropRect.left,
                cropRect.bottom
        };

        float[] destination = {
                0f,
                0f,
                previewView.getWidth(),
                0f,
                previewView.getWidth(),
                previewView.getHeight(),
                0f,
                previewView.getHeight()
        };

        int vertexSize = 2;

        int shiftOffset = rotationDegrees / 90 * vertexSize;
        float[] tempArray = destination.clone();
        for (int toIndex = 0; toIndex < source.length; toIndex++) {
            int fromIndex = (toIndex + shiftOffset) % source.length;
            destination[toIndex] = tempArray[fromIndex];
        }
        matrix.setPolyToPoly(source, 0, destination, 0, 4);
        return matrix;
    }
}

MainActivity.java

public class MainActivity extends AppCompatActivity {

    private static final int PERMISSIONS_REQUEST = 1;

    private static final String PERMISSION_CAMERA = Manifest.permission.CAMERA;

    public static final Size DESIRED_PREVIEW_SIZE = new Size(640, 480);

    private PreviewView previewView;

    ActivityMainBinding binding;

    @Override
    protected void onCreate(@Nullable Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        binding = ActivityMainBinding.inflate(getLayoutInflater());
        setContentView(binding.getRoot());

        previewView = findViewById(R.id.previewView);

        if (hasPermission()) {
            // Start CameraX
            startCamera();
        } else {
            requestPermission();
        }
    }

    @SuppressLint("UnsafeOptInUsageError")
    private void startCamera() {
        ListenableFuture<ProcessCameraProvider> cameraProviderFuture = ProcessCameraProvider.getInstance(this);

        cameraProviderFuture.addListener(() -> {
            // Camera provider is now guaranteed to be available
            try {
                ProcessCameraProvider cameraProvider = cameraProviderFuture.get();

                // Set up the view finder use case to display camera preview
                Preview preview = new Preview.Builder().build();

                // Choose the camera by requiring a lens facing
                CameraSelector cameraSelector = new CameraSelector.Builder()
                        .requireLensFacing(CameraSelector.LENS_FACING_BACK)
                        .build();

                // Image Analysis
                ImageAnalysis imageAnalysis =
                        new ImageAnalysis.Builder()
                                .setTargetResolution(DESIRED_PREVIEW_SIZE)
                                .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
                                .build();

                imageAnalysis.setAnalyzer(ContextCompat.getMainExecutor(this), imageProxy -> {
                    // Define rotation Degrees of the imageProxy
                    int rotationDegrees = imageProxy.getImageInfo().getRotationDegrees();
                    Log.v("ImageAnalysis_degrees", String.valueOf(rotationDegrees));

                    @SuppressLint("UnsafeExperimentalUsageError") Image mediaImage = imageProxy.getImage();
                    if (mediaImage != null) {
                        InputImage image =
                                InputImage.fromMediaImage(mediaImage, imageProxy.getImageInfo().getRotationDegrees());
                        //Pass image to an ML Kit Vision API
                        //...

                        LocalModel localModel =
                                new LocalModel.Builder()
                                        .setAssetFilePath("mobilenet_v1_0.75_192_quantized_1_metadata_1.tflite")
                                        .build();

                        CustomObjectDetectorOptions customObjectDetectorOptions =
                                new CustomObjectDetectorOptions.Builder(localModel)
                                        .setDetectorMode(CustomObjectDetectorOptions.STREAM_MODE)
                                        .enableClassification()
                                        .setClassificationConfidenceThreshold(0.5f)
                                        .setMaxPerObjectLabelCount(3)
                                        .build();

                        ObjectDetector objectDetector =
                                ObjectDetection.getClient(customObjectDetectorOptions);

                        objectDetector.process(image)
                                .addOnSuccessListener(detectedObjects -> {
                                    // Pass the ImageProxy along so DrawGraphic can build its mapping matrix
                                    getObjectResults(detectedObjects, imageProxy);
                                    Log.d("TAG", "onSuccess" + detectedObjects.size());
                                    for (DetectedObject detectedObject : detectedObjects) {
                                        Rect boundingBox = detectedObject.getBoundingBox();

                                        Integer trackingId = detectedObject.getTrackingId();
                                        for (DetectedObject.Label label : detectedObject.getLabels()) {
                                            String text = label.getText();
                                            int index = label.getIndex();
                                            float confidence = label.getConfidence();
                                        }
                                    }
                                })
                                .addOnFailureListener(e -> Log.e("TAG", "Object detection failed", e))
                                .addOnCompleteListener(result -> imageProxy.close());
                    } else {
                        // Always close the frame, otherwise the analyzer stops receiving images
                        imageProxy.close();
                    }

                });

                // Connect the preview use case to the previewView
                preview.setSurfaceProvider(
                        previewView.getSurfaceProvider());

                // Attach use cases to the camera with the same lifecycle owner
                if (cameraProvider != null) {
                    Camera camera = cameraProvider.bindToLifecycle(
                            this,
                            cameraSelector,
                            imageAnalysis,
                            preview);
                }

            } catch (ExecutionException | InterruptedException e) {
                e.printStackTrace();
            }


        }, ContextCompat.getMainExecutor(this));
    }

    private void getObjectResults(List<DetectedObject> detectedObjects, ImageProxy imageProxy) {
        for (DetectedObject object : detectedObjects) {
            if (binding.parentlayout.getChildCount() > 1) {
                binding.parentlayout.removeViewAt(1);
            }
            Rect rect = object.getBoundingBox();
            String text = "Undefined";
            if (object.getLabels().size() != 0) {
                text = object.getLabels().get(0).getText();
            }

            // DrawGraphic needs the frame and the preview to map image coordinates to the view
            DrawGraphic drawGraphic = new DrawGraphic(this, rect, text, imageProxy, previewView);
            binding.parentlayout.addView(drawGraphic);
        }
    }

    private boolean hasPermission() {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
            return checkSelfPermission(PERMISSION_CAMERA) == PackageManager.PERMISSION_GRANTED;
        } else {
            return true;
        }
    }

    private void requestPermission() {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
            if (shouldShowRequestPermissionRationale(PERMISSION_CAMERA)) {
                Toast.makeText(
                        this,
                        "Camera permission is required for this demo",
                        Toast.LENGTH_LONG)
                        .show();
            }
            requestPermissions(new String[]{PERMISSION_CAMERA}, PERMISSIONS_REQUEST);
        }
    }

    @Override
    public void onRequestPermissionsResult(
            final int requestCode, final String[] permissions, final int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        if (requestCode == PERMISSIONS_REQUEST) {
            if (allPermissionsGranted(grantResults)) {
                // Start CameraX
                startCamera();
            } else {
                requestPermission();
            }
        }
    }

    private static boolean allPermissionsGranted(final int[] grantResults) {
        for (int result : grantResults) {
            if (result != PackageManager.PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }
}

All of this leads to my question: why is the boundingBox slightly off? I will provide any further information needed to complete this question on request.

【Comments】:

Tags: java graphics object-detection tensorflow-lite google-mlkit


【Answer 1】:

As stated in the model description:

Image data: a ByteBuffer of size 192 x 192 x 3 x PIXEL_DEPTH, where PIXEL_DEPTH is 4 for the float model and 1 for the quantized model.

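Make sure your media.Image has the same resolution. Feeding the model image data of a different size can produce wrong bounding boxes and wrong detections, which is most likely why the object was detected as a spatula in the first place. You can either configure ImageAnalysis to deliver images at this resolution, or resize the image yourself before passing it to the model.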
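The size rule from the model description can be checked with a little arithmetic. The sketch below is plain Java with no Android dependencies; the class and method names (`ModelInputSize`, `inputBytes`, `allocateInput`) are hypothetical and exist only for illustration, and it assumes a 3-channel input as in the quoted description.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: computes the input buffer size described in the model notes.
final class ModelInputSize {
    static final int WIDTH = 192;
    static final int HEIGHT = 192;
    static final int CHANNELS = 3;

    // PIXEL_DEPTH is 1 byte for the quantized model and 4 bytes for the float model.
    static int inputBytes(boolean quantized) {
        int pixelDepth = quantized ? 1 : 4;
        return WIDTH * HEIGHT * CHANNELS * pixelDepth;
    }

    // Allocates a direct buffer of exactly the expected size.
    static ByteBuffer allocateInput(boolean quantized) {
        return ByteBuffer.allocateDirect(inputBytes(quantized)).order(ByteOrder.nativeOrder());
    }

    public static void main(String[] args) {
        System.out.println(inputBytes(true));  // prints 110592
        System.out.println(inputBytes(false)); // prints 442368
    }
}
```

A buffer whose capacity differs from these values would suggest that the image was not resized to 192 x 192 before being handed to the model.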

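Keep in mind that the output bounding box will be relative to the 192 x 192 image. You now need to convert these coordinates to the coordinates of the preview view. There are many ways to do that, but you can use this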
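As a rough illustration of that conversion, here is a minimal sketch in plain Java. It assumes, as the answer states, that the box is expressed in 192 x 192 model-input pixels, and it further assumes that the image is simply stretched to fill the view, ignoring any rotation or cropping the preview may apply. The names `BoxScaler` and `toViewCoordinates` are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper: scales a box from model-input pixels to view pixels.
final class BoxScaler {
    // Model input edge length quoted in the answer (192 x 192).
    static final int MODEL_SIZE = 192;

    // box is {left, top, right, bottom} in model-input pixels.
    // Assumption: the input is stretched to fill the whole view,
    // with no rotation, cropping, or letterboxing.
    static float[] toViewCoordinates(float[] box, int viewWidth, int viewHeight) {
        float scaleX = (float) viewWidth / MODEL_SIZE;
        float scaleY = (float) viewHeight / MODEL_SIZE;
        return new float[] {
                box[0] * scaleX, box[1] * scaleY, box[2] * scaleX, box[3] * scaleY
        };
    }

    public static void main(String[] args) {
        float[] viewBox = toViewCoordinates(new float[] {48f, 96f, 144f, 192f}, 1080, 1920);
        System.out.println(Arrays.toString(viewBox)); // prints [270.0, 960.0, 810.0, 1920.0]
    }
}
```

On a real device the preview is usually rotated and may be cropped, so a full solution would also have to account for both; this sketch covers only the scaling step.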

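【Comments】:

  • I am not an expert in this area, but a Google search turned up some APIs. Maybe you could try using a matrix by updating the onDraw logic in DrawGraphic.java: developer.android.com/reference/android/graphics/… or developer.android.com/reference/android/graphics/…
  • The ML Kit demo app has another solution that does not require computing a matrix. You can follow the code at github.com/googlesamples/mlkit/blob/… and dig in from there.
  • To use your getMappingMatrix method, try calling canvas.setMatrix or canvas.concat with getMappingMatrix(imageProxy, previewView) directly below super.onDraw(canvas);. You first need to pass imageProxy and previewView to DrawGraphic so that you can use them there.
  • To clarify again, I am not sure whether this approach will work. I only tried to follow your existing code and give some hints for finishing it. The best approach I can suggest is still to work through the codelab, learn the underlying concepts, and then implement it. That said, for your current code, to call getMappingMatrix in onDraw you need to move the method declaration from the MainActivity file to the DrawGraphic file. You also need to store imageProxy and previewView as fields (member variables) so that you can use them later in onDraw. The Matrix you are looking for is the return value of getMappingMatrix.
  • The reason you get a NullPointerException is that you never initialized them. You should pass imageProxy and previewView as parameters of the DrawGraphic constructor and use them to initialize the two fields. Note that you also need to pass these two values through the whole flow in MainActivity.java; otherwise you may just get a NullPointerException somewhere else.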
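The part of getMappingMatrix discussed in these comments that is easiest to get wrong is the rotation step, which shifts the destination corners by one vertex per 90 degrees before setPolyToPoly is called. The sketch below isolates that step in plain Java so it can be run without Android; the class name `VertexShift` and the method name `shiftForRotation` are hypothetical, while the loop itself mirrors the one in the question.

```java
import java.util.Arrays;

// Hypothetical helper: reproduces the corner-shifting loop from getMappingMatrix.
final class VertexShift {
    // destination holds four corners as {x0, y0, x1, y1, x2, y2, x3, y3}.
    // Each 90 degrees of rotation shifts the corners by one vertex (two floats).
    static float[] shiftForRotation(float[] destination, int rotationDegrees) {
        int vertexSize = 2;
        int shiftOffset = rotationDegrees / 90 * vertexSize;
        float[] shifted = new float[destination.length];
        for (int toIndex = 0; toIndex < destination.length; toIndex++) {
            int fromIndex = (toIndex + shiftOffset) % destination.length;
            shifted[toIndex] = destination[fromIndex];
        }
        return shifted;
    }

    public static void main(String[] args) {
        // Corners of a 100 x 200 view: top-left, top-right, bottom-right, bottom-left.
        float[] corners = {0f, 0f, 100f, 0f, 100f, 200f, 0f, 200f};
        System.out.println(Arrays.toString(shiftForRotation(corners, 90)));
        // prints [100.0, 0.0, 100.0, 200.0, 0.0, 200.0, 0.0, 0.0]
    }
}
```

With a rotation of 90 degrees the image's top-left corner is mapped to the view's top-right corner, which is why passing the wrong rotation value produces a box that looks rotated or mirrored rather than merely offset.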