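【Posted】: 2015-07-24 12:05:09
【Question】:
I have implemented a CUDA function for resizing an image using bilinear interpolation. The function appeared to give correct results (visually), until I tested it on a small matrix to check the exact values of the output image. The results I get differ from those of OpenCV and MATLAB. I can't find any obvious flaw in my algorithm. Can anyone help me sort out this issue?
Bilinear interpolator function: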
texture<float, cudaTextureType2D> tex32f;

//Device function
__device__ float blinterp(const float xIndex, const float yIndex)
{
    //floor the coordinates to get to the nearest valid pixel
    const int intX = static_cast<int>(xIndex);
    const int intY = static_cast<int>(yIndex);

    //Set weights of pixels according to distance from actual location
    const float a = xIndex - intX;
    const float b = yIndex - intY;

    /*  _____________________
     * |          |          |
     * |(1-a)(1-b)| (a)(1-b) |
     * |__________|__________|
     * |          |          |
     * | (1-a)(b) |  (a)(b)  |
     * |__________|__________|
     */

    //Compute the weighted average of 4 nearest pixels
    float out = (1 - a) * (1 - b) * tex2D(tex32f, intX, intY)
              + (a) * (1 - b) * tex2D(tex32f, intX + 1, intY)
              + (1 - a) * (b) * tex2D(tex32f, intX, intY + 1)
              + (a * b) * tex2D(tex32f, intX + 1, intY + 1);
    return out;
}
Resize kernel:
__global__ void kernel_resize(float* dst, int dstWidth, int dstHeight, int dstPitch, float xScale, float yScale)
{
    const int xIndex = blockIdx.x * blockDim.x + threadIdx.x;
    const int yIndex = blockIdx.y * blockDim.y + threadIdx.y;

    if(xIndex>=dstWidth || yIndex>=dstHeight) return;

    const unsigned int tid = yIndex * dstPitch + xIndex;

    const float inXindex = xIndex * xScale;
    const float inYindex = yIndex * yScale;

    dst[tid] = blinterp(inXindex,inYindex);
}
Wrapper function:
int resize_32f_c1(float* src,float* dst,int srcWidth,int srcHeight, int srcPitch, int dstWidth,int dstHeight,int dstPitch)
{
    if((srcWidth == dstWidth) && (srcHeight == dstHeight))
    {
        cudaMemcpy2D(dst,dstPitch,src,srcPitch,srcWidth * sizeof(float),srcHeight,cudaMemcpyDeviceToDevice);
        return 0;
    }

    cudaBindTexture2D(NULL,tex32f,src,srcWidth,srcHeight,srcPitch);

    dim3 Block(16,16);
    dim3 Grid((dstWidth + Block.x - 1)/Block.x, (dstHeight + Block.y - 1)/Block.y);

    float x = (float)(srcWidth)/(float)dstWidth;
    float y = (float)(srcHeight)/(float)dstHeight;

    kernel_resize<<<Grid,Block>>>(dst,dstWidth,dstHeight,dstPitch/sizeof(float),x,y);

    cudaUnbindTexture(tex32f);
    return 0;
}
Results (downscaling by a factor of 2):
Input (10 x 10):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
MATLAB and OpenCV output:
0  0     0    0     0
0  0.25  0.5  0.25  0
0  0.5   1    0.5   0
0  0.25  0.5  0.25  0
0  0     0    0     0
My output:
0 0 0 0 0
0 0 0 0 0
0 0 1 1 0
0 0 1 1 0
0 0 0 0 0
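【Comments】:
- Why are you passing integer coordinates to tex2D, and why aren't the coordinates correctly centered?
- @talonmies... I tried changing the arguments of tex2D to float. It had no effect. What are voxel-centered coordinates? How do I do that?
- @RobertCrovella... Actually, I am trying to do full-precision interpolation instead of using CUDA's built-in 9-bit interpolation. Also, as suggested by @talonmies, I tried centering the pixel coordinates by passing intX + 0.5f and intY + 0.5f to tex2D, but I got the same results again.
- I have deleted my answer and comments, because they were based on incorrect assumptions about your code.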
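The pixel-centering question raised in the comments can be checked on the CPU before touching the CUDA code. The following is only a minimal sketch, not the code from the question: `at`, `bilinear`, `resize_ref` and `make_test_image` are hypothetical helper names, and the centered mapping `(x + 0.5) * scale - 0.5` is an assumption about the convention behind the reference output (it is, as far as I know, what OpenCV's bilinear resize uses). With `centered = false` the sketch uses `x * scale`, the same mapping as `kernel_resize`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Clamped read from a row-major image (edge pixels are repeated at the borders).
static float at(const std::vector<float>& img, int w, int h, int x, int y)
{
    x = std::min(std::max(x, 0), w - 1);
    y = std::min(std::max(y, 0), h - 1);
    return img[y * w + x];
}

// Bilinear sample at a non-negative continuous source coordinate (sx, sy).
static float bilinear(const std::vector<float>& img, int w, int h, float sx, float sy)
{
    const int x0 = static_cast<int>(sx);   // truncation equals floor for sx >= 0
    const int y0 = static_cast<int>(sy);
    const float a = sx - x0;               // horizontal weight
    const float b = sy - y0;               // vertical weight
    return (1 - a) * (1 - b) * at(img, w, h, x0,     y0)
         + a       * (1 - b) * at(img, w, h, x0 + 1, y0)
         + (1 - a) * b       * at(img, w, h, x0,     y0 + 1)
         + a       * b       * at(img, w, h, x0 + 1, y0 + 1);
}

// Hypothetical CPU reference: resizes src (sw x sh) to dw x dh.
// centered == true  : source coordinate = (dst + 0.5) * scale - 0.5 (assumed convention)
// centered == false : source coordinate = dst * scale (the mapping in kernel_resize)
std::vector<float> resize_ref(const std::vector<float>& src, int sw, int sh,
                              int dw, int dh, bool centered)
{
    assert(dw > 0 && dh > 0);
    const float xs = static_cast<float>(sw) / dw;
    const float ys = static_cast<float>(sh) / dh;
    std::vector<float> dst(static_cast<size_t>(dw) * dh);
    for (int y = 0; y < dh; ++y) {
        for (int x = 0; x < dw; ++x) {
            float sx = centered ? (x + 0.5f) * xs - 0.5f : x * xs;
            float sy = centered ? (y + 0.5f) * ys - 0.5f : y * ys;
            // Keep coordinates non-negative so the truncating cast acts as floor.
            sx = std::max(sx, 0.0f);
            sy = std::max(sy, 0.0f);
            dst[y * dw + x] = bilinear(src, sw, sh, sx, sy);
        }
    }
    return dst;
}

// Builds the 10 x 10 test image from the question: ones in rows/columns 3..6.
std::vector<float> make_test_image()
{
    std::vector<float> img(10 * 10, 0.0f);
    for (int y = 3; y <= 6; ++y)
        for (int x = 3; x <= 6; ++x)
            img[y * 10 + x] = 1.0f;
    return img;
}
```

On the test image, `resize_ref(make_test_image(), 10, 10, 5, 5, true)` gives 0.25 at (1,1), 0.5 at (1,2) and 1 at (2,2), which matches the reference output, while `centered = false` gives the 0/1 block listed under "My output". If that is indeed the cause, the analogous change in `kernel_resize` would be to compute `inXindex` as `(xIndex + 0.5f) * xScale - 0.5f` (and likewise for `inYindex`); that change is untested here.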
Tags: image-processing cuda image-resizing