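【Posted】: 2015-02-24 14:50:35
【Question】:
I am trying to build OpenCV 2.4.10 with CUDA 6.5 on a Windows 8.1 machine. My other third-party libraries installed successfully. When I run a simple GPU-based program, I get the error "No GPU found or the library was compiled without GPU support". I also ran the sample executables built during installation, such as performance_gpu.exe, and got the same error. I checked the WITH_CUDA flag as well. These are the CUDA-related flags set during the CMake configuration: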
- WITH_CUDA: checked
- WITH_CUBLAS: checked
- WITH_CUFFT: checked
- CUDA_ARCH_BIN: 1.1 1.2 1.3 2.0 2.1(2.0) 3.0 3.5
- CUDA_ARCH_PTX: 3.0
- CUDA_FAST_MATH: checked
- CUDA_GENERATION: Auto
- CUDA_HOST_COMPILER: $(VCInstallDir)bin
- CUDA_SEPARABLE_COMPILATION: unchecked
- CUDA_TOOLKIT_ROOT_DIR: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v6.5
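Another thing: some posts I have read say that building with CUDA takes a long time. My build took about 3 hours, most of it spent compiling the .cu files. As far as I can tell, no errors occurred while those files were compiled.
Some posts also mention a directory named gpu inside the build directory, but I don't see one!
I am using Visual Studio 2013.
What could the problem be? Please help!
Update:
I built OpenCV again, and this time I added CUDA's bin, lib, and include directories before starting the build. After the build, I ran gpu_perf4au.exe from E:\opencv\build\bin\Release and got this output: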
[----------]
[ INFO ] Implementation variant: cuda.
[----------]
[----------]
[ GPU INFO ] Run test suite on GeForce GTX 860M GPU.
[----------]
Time compensation is 0
OpenCV version: 2.4.10
OpenCV VCS version: unknown
Build type: release
Parallel framework: tbb
CPU features: sse sse2 sse3 ssse3 sse4.1 sse4.2 avx avx2
[----------]
[ GPU INFO ] Run on OS Windows x64.
[----------]
*** CUDA Device Query (Runtime API) version (CUDART static linking) ***
Device count: 1
Device 0: "GeForce GTX 860M"
CUDA Driver Version / Runtime Version 6.50 / 6.50
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 2048 MBytes (2147483648 bytes)
GPU Clock Speed: 1.02 GHz
Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support enabled: No
Device is using TCC driver mode: No
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 1 / 0
Compute Mode:
Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.50, CUDA Runtime Version = 6.50, NumDevs = 1
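The device query reports compute capability 5.0, while the CMake flags listed earlier stop at CUDA_ARCH_BIN 3.5 with CUDA_ARCH_PTX 3.0. A minimal C++ sketch of the usual compatibility rule is given here: PTX of an equal or lower version can be JIT-compiled for the device, and a binary runs on a device with the same major version and an equal or higher minor version. Under that rule, this mismatch should not by itself disable GPU support. The `Arch` struct and the `isCovered` function are hypothetical names for illustration, not OpenCV API, and the rule is a simplification of what the CUDA runtime actually does.

```cpp
#include <vector>

// Hypothetical helper type: a CUDA compute capability such as 3.5 or 5.0.
struct Arch {
    int majorVer;
    int minorVer;
};

// Returns true if a device can run code from a build with the given
// binary (cubin) and PTX targets, under a simplified rule:
//  - PTX of an equal or lower version can be JIT-compiled for the device;
//  - a binary runs on a device with the same major version and an
//    equal or higher minor version.
bool isCovered(const Arch& device,
               const std::vector<Arch>& binTargets,
               const std::vector<Arch>& ptxTargets)
{
    for (const Arch& p : ptxTargets) {
        if (p.majorVer < device.majorVer ||
            (p.majorVer == device.majorVer && p.minorVer <= device.minorVer))
            return true;
    }
    for (const Arch& b : binTargets) {
        if (b.majorVer == device.majorVer && b.minorVer <= device.minorVer)
            return true;
    }
    return false;
}
```

Calling `isCovered` with a 5.0 device, the binary targets 1.1 through 3.5, and the PTX target 3.0 returns true because of the PTX entry. With an empty PTX list it returns false, since no binary target has major version 5.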
I thought everything was fine. I then ran the program below, whose property sheets include all the OpenCV and CUDA directories:
#include <cv.h>
#include <highgui.h>
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>
using namespace std;
using namespace cv;

// Thresholds a grayscale image on the GPU and returns the result on the host.
Mat thresholder(const Mat& input) {
    gpu::GpuMat dst, src;
    src.upload(input);                               // host -> device
    gpu::threshold(src, dst, 128.0, 255.0, CV_THRESH_BINARY);
    Mat result_host(dst);                            // device -> host
    return result_host;
}

int main(int argc, char* argv[]) {
    cvNamedWindow("Camera_Output", 1);
    CvCapture* capture = cvCaptureFromCAM(CV_CAP_ANY);
    while (true) {
        IplImage* frame = cvQueryFrame(capture);     // owned by the capture; do not release
        if (!frame) break;                           // camera unavailable or stream ended
        IplImage* gray_frame = cvCreateImage(cvGetSize(frame), IPL_DEPTH_8U, 1);
        cvCvtColor(frame, gray_frame, CV_BGR2GRAY);  // camera frames are BGR
        Mat temp(gray_frame);                        // wraps gray_frame without copying
        Mat thres_temp = thresholder(temp);
        imshow("Camera_Output", thres_temp);
        cvReleaseImage(&gray_frame);                 // avoid leaking one image per frame
        char key = (char)cvWaitKey(10);
        if (key == 27) break;                        // ESC exits the loop
    }
    cvReleaseCapture(&capture);
    cvDestroyWindow("Camera_Output");
    return 0;
}
I got this error:
OpenCV Error: No GPU support (The library is compiled without CUDA support) in EmptyFuncTable::mallocPitch, file C:\builds\2_4_PackSlave-win64-vc12-shared\opencv\modules\dynamicuda\include\opencv2/dynamicuda/dynamicuda.hpp, line 126
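One detail in the error message may be worth checking: the file path is recorded when the library is compiled, so it points at the source tree of the OpenCV binary that was actually loaded. Here it begins with C:\builds\2_4_PackSlave-win64-vc12-shared rather than E:\opencv, which suggests (as an assumption, not a confirmed diagnosis) that the program picked up a different OpenCV build than the one compiled with CUDA. The sketch below pulls that path out of a message of this shape and compares it against an expected build root. `extractSourcePath` and `startsWithNoCase` are hypothetical helpers written for illustration, not part of OpenCV.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text between ", file " and ", line " in an
// OpenCV-style error message, or an empty string if either marker is missing.
std::string extractSourcePath(const std::string& message)
{
    const std::string fileTag = ", file ";
    const std::string lineTag = ", line ";
    std::size_t begin = message.find(fileTag);
    if (begin == std::string::npos) return "";
    begin += fileTag.size();
    std::size_t end = message.find(lineTag, begin);
    if (end == std::string::npos) return "";
    return message.substr(begin, end - begin);
}

// Hypothetical helper: case-insensitive prefix test, since Windows paths
// are not case-sensitive.
bool startsWithNoCase(const std::string& text, const std::string& prefix)
{
    if (prefix.size() > text.size()) return false;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(text[i])) !=
            std::tolower(static_cast<unsigned char>(prefix[i])))
            return false;
    }
    return true;
}
```

If `startsWithNoCase(extractSourcePath(message), "E:\\opencv")` is false for the message above, the PATH and the linker's library directories are a reasonable place to look for a second copy of the OpenCV DLLs.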
【Discussion】: