【Posted】: 2016-03-07 20:28:55
【Problem description】:
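I am trying to use cudaMallocPitch and cudaMemcpy2D, but when I call cudaMemcpy2D with a large array I run into a problem:
Segmentation fault
Here is source code that runs without reporting any error.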
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <random>

#define ROW_SIZE 32
#define COL_SIZE 1024

int main()
{
    // Host data: a table of ROW_SIZE pointers, each row allocated separately
    float **pfTest;
    pfTest = (float**)malloc(ROW_SIZE * sizeof(float*));
    for (int i = 0; i < ROW_SIZE; i++) {
        pfTest[i] = (float*)malloc(COL_SIZE * sizeof(float));
    }

    // Fill every element with a random value
    std::default_random_engine generator;
    std::uniform_real_distribution<float> distribution;
    for (int y = 0; y < ROW_SIZE; y++) {
        for (int x = 0; x < COL_SIZE; x++) {
            pfTest[y][x] = distribution(generator);
        }
    }

    // Pitched device allocation, then a 2D host-to-device copy
    float *dev_Test;
    size_t pitch;
    cudaMallocPitch(&dev_Test, &pitch, COL_SIZE * sizeof(float), ROW_SIZE);
    cudaMemcpy2D(dev_Test, pitch, pfTest, COL_SIZE * sizeof(float), COL_SIZE * sizeof(float), ROW_SIZE, cudaMemcpyHostToDevice);
    printf("%s\n", cudaGetErrorString(cudaGetLastError()));
    return 0;
}
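As you can see, it runs without any problem.
However, when I increase COL_SIZE to around 500,000 (524288, to be exact), it crashes with a segmentation fault.
Can anyone help me find the root cause of the problem?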
【Discussion】: