How OpenCL works with OpenCV (answers)

【Question title】: How opencl work with opencv
【Posted】: 2021-12-13 12:01:23
【Question】:

I am trying to write a region-growing algorithm with OpenCL, and I will open the image with OpenCV. The question is how to pass the image data to OpenCL.

I am using OpenCV 2.4.9 and OpenCL from AMD APP SDK\2.9-1 in Visual Studio.

Can anyone tell me what I should do after opening the image with OpenCV?

【Question comments】:

【Answer 1】:

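In an OpenCL application there are generally two ways to transfer an image (or any other data) from the host program to the device program: 1) using a buffer, 2) using an Image2D. Both use the cl_mem type. Because a buffer is simpler to use than an Image2D (especially for grayscale images), I explain how to transfer an image from the host to the device using an OpenCL buffer.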

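After the input image has been read into an OpenCV Mat object, it is converted to grayscale. Then we call clCreateBuffer, which returns a cl_mem buffer. We can simply pass data (a member of the Mat object) to clCreateBuffer so that the kernel's input buffer is initialized with the input image data. Next we pass the created buffer to the kernel with clSetKernelArg. Finally, when the kernel has finished its work, we read the result back with clEnqueueReadBuffer.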

Read the comments to understand this code, and don't hesitate to ask questions.

Host code:


// Create the context, kernel and other OpenCL requirements before this section....

Mat image = imread("logo.bmp", CV_LOAD_IMAGE_COLOR); // reading input image by opencv to Mat type
Mat input_;

cvtColor(image, input_, CV_BGR2GRAY); // convert input image to gray scale

// inputSignalBuffer is initialized from input_.data, which holds the input image data
cl_mem inputSignalBuffer = clCreateBuffer(
    context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, input_.rows * input_.cols * input_.elemSize(),
    static_cast<void *>(input_.data), &errNum);
checkErr(errNum, "clCreateBuffer(inputSignal)");

// create an empty output buffer that the kernel will write its result into
cl_mem outputSignalBuffer = clCreateBuffer(
    context, CL_MEM_WRITE_ONLY, input_.rows * input_.cols * input_.elemSize(), NULL, &errNum);
checkErr(errNum, "clCreateBuffer(outputSignal)");

// pass the input, mask and output buffers to the kernel so they can be used on the device
// (maskBuffer and maskWidth are expected to be created in the setup code before this section)
errNum = clSetKernelArg(kernel, 0, sizeof(cl_mem), &inputSignalBuffer);
errNum |= clSetKernelArg(kernel, 1, sizeof(cl_mem), &maskBuffer);
errNum |= clSetKernelArg(kernel, 2, sizeof(cl_mem), &outputSignalBuffer);
errNum |= clSetKernelArg(kernel, 3, sizeof(cl_uint), &input_.rows);
errNum |= clSetKernelArg(kernel, 4, sizeof(cl_uint), &input_.cols);
errNum |= clSetKernelArg(kernel, 5, sizeof(cl_uint), &maskWidth);
checkErr(errNum, "clSetKernelArg");

size_t localWorkSize[2] = {16, 16}; // 2-dimensional range with a 16x16 work-group
// Note: the global work size (input image rows and cols) must be a multiple of the work-group size.
size_t globalWorkSize[2] = {static_cast<size_t>(input_.rows), static_cast<size_t>(input_.cols)};

// Queue the kernel up for execution across the array
errNum =
    clEnqueueNDRangeKernel( // this call does not block; on an in-order queue the blocking read below waits for the kernel
        queue, kernel, 2, NULL, globalWorkSize, localWorkSize, 0, NULL, NULL);
checkErr(errNum, "clEnqueueNDRangeKernel");

Mat output_ = cv::Mat(input_.rows, input_.cols, CV_8UC1);

errNum = clEnqueueReadBuffer( // blocking read (CL_TRUE) of the kernel's output buffer
    queue, outputSignalBuffer, CL_TRUE, 0, input_.rows * input_.cols * input_.elemSize(),
    output_.data, // the kernel results are copied straight into the OpenCV Mat's data
    0, NULL, NULL);
checkErr(errNum, "clEnqueueReadBuffer");

// cut the extra border spaces which has been added in the first part of the code in order to adjust image
// size with Work Group Size;

cv::imwrite("output.bmp", output_); // saving output in image file
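
The note above says the global work size must be a multiple of the work-group size, but the snippet does not show how to get there for arbitrary image sizes. A minimal sketch of the usual rounding step follows; `roundUpToMultiple` is a hypothetical helper name, not something from the original answer or from the OpenCL API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (an assumption, not part of the original answer):
// rounds `value` up to the nearest multiple of `multiple`.
std::size_t roundUpToMultiple(std::size_t value, std::size_t multiple) {
    assert(multiple > 0);
    const std::size_t remainder = value % multiple;
    return remainder == 0 ? value : value + (multiple - remainder);
}
```

With a 16x16 work-group, a 100x250 image would get a global size of roundUpToMultiple(100, 16) by roundUpToMultiple(250, 16), that is 112 by 256. Because the kernel below writes output for every work-item, either the image itself has to be padded to that size (which is what the border-cutting comment hints at), or the kernel needs an early return for work-items with curr_x >= inputHeight or curr_y >= inputWidth.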

Kernel code:

__kernel void convolve(const __global uchar *const input, __constant uint *const mask,
                       __global uchar *const output, const int inputHeight, const int inputWidth,
                       const int maskWidth) {
  uint sum = 0;

  const int curr_x = get_global_id(0); // current curr_x (row)
  const int curr_y = get_global_id(1); // current curr_y (col)
  int d = maskWidth / 2;

  if (curr_x > d - 1 && curr_y > d - 1 && curr_x < inputHeight - d &&
      curr_y < inputWidth - d) // checking mask borders not to be out of input matrix
    for (int i = -d; i <= d; i++)
      for (int j = -d; j <= d; j++) {
        int mask_ptr =
            maskWidth * (i + d) + (j + d); // you can also use mad24(maskWidth, i+d, j+d) which is faster.
        sum += input[(curr_x + i) * inputWidth + curr_y + j] * mask[mask_ptr];
      }

  sum /= (maskWidth * maskWidth); // take the average

  sum = clamp(sum, (uint)0, (uint)255); // clamp == min(max(x, minval), maxval)

  output[curr_x * inputWidth + curr_y] = sum;
}
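
When debugging a kernel like this it can help to compare the device output with a plain CPU version. The sketch below mirrors the kernel's indexing, border handling, averaging and clamping in standard C++ without OpenCV or OpenCL; the name `convolveReference` and the use of std::vector are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference for the convolve kernel above (hypothetical helper).
// `input` is a row-major grayscale image of height x width bytes,
// `mask` holds maskWidth x maskWidth unsigned weights.
std::vector<std::uint8_t> convolveReference(const std::vector<std::uint8_t>& input,
                                            const std::vector<std::uint32_t>& mask,
                                            int height, int width, int maskWidth) {
    assert(input.size() == static_cast<std::size_t>(height) * width);
    assert(mask.size() == static_cast<std::size_t>(maskWidth) * maskWidth);
    const int d = maskWidth / 2;
    // Border pixels stay 0, which is what the kernel writes for them too.
    std::vector<std::uint8_t> output(input.size(), 0);
    for (int row = d; row < height - d; ++row) {
        for (int col = d; col < width - d; ++col) {
            std::uint32_t sum = 0;
            for (int i = -d; i <= d; ++i) {
                for (int j = -d; j <= d; ++j) {
                    sum += input[(row + i) * width + (col + j)] *
                           mask[maskWidth * (i + d) + (j + d)];
                }
            }
            sum /= static_cast<std::uint32_t>(maskWidth * maskWidth);  // average
            output[row * width + col] =
                static_cast<std::uint8_t>(sum > 255u ? 255u : sum);    // clamp to 0..255
        }
    }
    return output;
}
```

To use it, copy input_.data into a std::vector<std::uint8_t>, run the function with the same mask, and compare the result byte by byte with output_.data after clEnqueueReadBuffer.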

【Comments】:

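Hi Soheil Shababi, thanks for your answer, but I am confused about where to write this program: in an OpenCL, an OpenCV, or an OpenCV-CL project, since each of them has its own integration in Visual Studio. There are also some words I have not seen in other projects, and some of the functions are unfamiliar.
Object? Contex? Kerenl? miangin gereftan ↔ average?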