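【Posted】: 2021-12-17 11:52:56
【Question】:
The following CUDA kernel code suffers from branch divergence: threads in the same warp take different branches depending on gx % 4. How can it be optimized?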
int gx = threadIdx.x + blockDim.x * blockIdx.x;
val = g_data[gx];
if (gx % 4 == 0)
    val = op1(val);
else if (gx % 4 == 1)
    val = op2(val);
else if (gx % 4 == 2)
    val = op3(val);
else if (gx % 4 == 3)
    val = op4(val);
g_data[gx] = val;
【Discussion】:
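One possible approach, offered as a sketch rather than a definitive answer: because the branch depends only on gx % 4, each thread can process four consecutive elements instead of one. Every thread then runs op1 through op4 exactly once, so no data-dependent branch is left inside a warp, and the loads stay coalesced through float4. The code below makes several assumptions that the question does not state: g_data holds float, the buffer is 16-byte aligned (true for memory returned by cudaMalloc), and the element count is a multiple of 4. The bodies of op1..op4 are hypothetical placeholders, and apply_ops_vec4 and run_vec4_demo are names made up for this illustration.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-ins for op1..op4; the question does not define them.
__host__ __device__ inline float op1(float v) { return v + 1.0f; }
__host__ __device__ inline float op2(float v) { return v * 2.0f; }
__host__ __device__ inline float op3(float v) { return v - 3.0f; }
__host__ __device__ inline float op4(float v) { return v * 0.5f; }

// Each thread handles elements 4*i .. 4*i+3, so all threads in a warp
// execute the same instruction sequence and no branch on gx % 4 remains.
// Assumption: the data is float, 16-byte aligned, with length 4 * n4.
__global__ void apply_ops_vec4(float4 *g_data4, int n4)
{
    int i = threadIdx.x + blockDim.x * blockIdx.x;
    if (i >= n4) return;          // bounds guard for the last block
    float4 v = g_data4[i];        // one 16-byte load per thread
    v.x = op1(v.x);               // original index 4*i     (gx % 4 == 0)
    v.y = op2(v.y);               // original index 4*i + 1 (gx % 4 == 1)
    v.z = op3(v.z);               // original index 4*i + 2 (gx % 4 == 2)
    v.w = op4(v.w);               // original index 4*i + 3 (gx % 4 == 3)
    g_data4[i] = v;
}

// Hypothetical host helper: runs the kernel on n floats and returns the
// number of elements that differ from a CPU reference, or -1 on error.
int run_vec4_demo(int n)
{
    if (n <= 0 || n % 4 != 0) return -1;
    std::vector<float> h(n), ref(n);
    for (int i = 0; i < n; ++i) {
        h[i] = static_cast<float>(i);
        switch (i % 4) {          // same selection rule as the original code
            case 0:  ref[i] = op1(h[i]); break;
            case 1:  ref[i] = op2(h[i]); break;
            case 2:  ref[i] = op3(h[i]); break;
            default: ref[i] = op4(h[i]); break;
        }
    }
    float *d = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    if (cudaMalloc(&d, bytes) != cudaSuccess) return -1;
    cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);

    int n4 = n / 4;
    int threads = 256;
    int blocks = (n4 + threads - 1) / threads;
    apply_ops_vec4<<<blocks, threads>>>(reinterpret_cast<float4 *>(d), n4);
    cudaError_t err = cudaDeviceSynchronize();

    cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (h[i] != ref[i]) ++mismatches;
    return mismatches;
}
```

Calling run_vec4_demo(1024) from a main function and checking that it returns 0 is one way to confirm the kernel matches the branching version. If the data cannot be regrouped this way, an alternative is to remap indices so that each warp handles a single residue class of gx % 4. That also removes the divergence, but each warp then reads with a stride of 4 elements, so it is worth profiling before choosing it.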
Tags: cuda