【发布时间】:2013-06-17 17:36:30
【问题描述】:
我目前正在使用具有 4 个 Tesla T10 gpu 的 gpu 服务器。当我不断测试内核并且不得不经常使用 ctrl-C 终止进程时,我在一个简单的设备查询代码的末尾添加了几行代码。代码如下:
#include <stdio.h>
// Print device properties
void printDevProp(cudaDeviceProp devProp)
{
printf("Major revision number: %d\n", devProp.major);
printf("Minor revision number: %d\n", devProp.minor);
printf("Name: %s\n", devProp.name);
printf("Total global memory: %u\n", devProp.totalGlobalMem);
printf("Total shared memory per block: %u\n", devProp.sharedMemPerBlock);
printf("Total registers per block: %d\n", devProp.regsPerBlock);
printf("Warp size: %d\n", devProp.warpSize);
printf("Maximum memory pitch: %u\n", devProp.memPitch);
printf("Maximum threads per block: %d\n", devProp.maxThreadsPerBlock);
for (int i = 0; i < 3; ++i)
printf("Maximum dimension %d of block: %d\n", i, devProp.maxThreadsDim[i]);
for (int i = 0; i < 3; ++i)
printf("Maximum dimension %d of grid: %d\n", i, devProp.maxGridSize[i]);
printf("Clock rate: %d\n", devProp.clockRate);
printf("Total constant memory: %u\n", devProp.totalConstMem);
printf("Texture alignment: %u\n", devProp.textureAlignment);
printf("Concurrent copy and execution: %s\n", (devProp.deviceOverlap ? "Yes" : "No"));
printf("Number of multiprocessors: %d\n", devProp.multiProcessorCount);
printf("Kernel execution timeout: %s\n", (devProp.kernelExecTimeoutEnabled ? "Yes" : "No"));
return;
}
int main()
{
// Number of CUDA devices
int devCount;
cudaGetDeviceCount(&devCount);
printf("CUDA Device Query...\n");
printf("There are %d CUDA devices.\n", devCount);
// Iterate through devices
for (int i = 0; i < devCount; ++i)
{
// Get device properties
printf("\nCUDA Device #%d\n", i);
cudaDeviceProp devProp;
cudaGetDeviceProperties(&devProp, i);
printDevProp(devProp);
}
printf("\nPress any key to exit...");
char c;
scanf("%c", &c);
**for (int i = 0; i < devCount; i++) {
cudaSetDevice(i);
cudaDeviceReset();
}**
return 0;
}
我的查询与 main() 结束之前的 for 循环有关,在该循环中我一一设置每个设备,然后使用 cudaResetDevice 命令。我有一种奇怪的感觉,这段代码虽然没有产生任何错误,但我无法重置所有设备。相反,程序每次只重置默认设备,即设备 0。谁能告诉我我应该怎么做才能重置 4 个设备中的每一个。
谢谢
【问题讨论】:
标签: cuda