Does gpuocelot support dynamic memory allocation in CUDA device code?

【Question title】: Does gpuocelot support dynamic memory allocation in CUDA device?
【Posted】: 2013-01-03 21:53:02
【Question】:

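My algorithm (parallel multifrontal Gaussian elimination) needs to allocate memory dynamically inside a CUDA kernel (for tree building). Does anyone know whether gpuocelot supports this?

According to this: stackoverflow-link and the CUDA programming guide, I should be able to do this. With gpuocelot, however, I get errors at runtime.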

Errors:

When I call malloc() inside a kernel, I get this error:

(2.000239) ExternalFunctionSet.cpp:371: Assertion message: LLVM required to call external host functions from PTX.
solver: ocelot/ir/implementation/ExternalFunctionSet.cpp:371: void ir::ExternalFunctionSet::ExternalFunction::call(void*, const ir::PTXKernel::Prototype&): Assertion `false' failed.

When I try to get or set the malloc heap size (in host code):

solver: ocelot/cuda/implementation/CudaRuntimeInterface.cpp:811: virtual cudaError_t cuda::CudaRuntimeInterface::cudaDeviceGetLimit(size_t *, cudaLimit): Assertion `0 && "unimplemented"' failed.
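
For reference, the pattern described above can be sketched roughly as follows: a kernel that calls malloc() and free() on the device heap, plus host code that sets and reads the heap size through cudaDeviceSetLimit and cudaDeviceGetLimit with cudaLimitMallocHeapSize. This is only a minimal sketch of stock CUDA runtime usage (compute capability 2.0 or later), not the original solver code and not a fix for gpuocelot. The names allocPerThread, ok and heapBytes, and the 16 MiB heap size, are hypothetical choices made for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread allocates a small buffer from the
// device heap, writes to it, checks one value, and frees it again.
__global__ void allocPerThread(int *ok)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;

    int *node = static_cast<int *>(malloc(4 * sizeof(int)));
    if (node == NULL) {        // device malloc returns NULL when the heap is exhausted
        ok[tid] = 0;
        return;
    }

    for (int i = 0; i < 4; ++i)
        node[i] = tid + i;

    ok[tid] = (node[3] == tid + 3) ? 1 : 0;
    free(node);
}

int main()
{
    const int threads = 64;

    // The heap size has to be set before launching any kernel that uses malloc().
    size_t heapBytes = 16 * 1024 * 1024;   // assumed value, chosen for illustration
    cudaError_t err = cudaDeviceSetLimit(cudaLimitMallocHeapSize, heapBytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSetLimit: %s\n", cudaGetErrorString(err));
        return EXIT_FAILURE;
    }

    size_t actual = 0;
    err = cudaDeviceGetLimit(&actual, cudaLimitMallocHeapSize);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetLimit: %s\n", cudaGetErrorString(err));
        return EXIT_FAILURE;
    }
    printf("device malloc heap size: %zu bytes\n", actual);

    int *dOk = NULL;
    cudaMalloc(&dOk, threads * sizeof(int));
    allocPerThread<<<1, threads>>>(dOk);
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return EXIT_FAILURE;
    }

    int hOk[threads];
    cudaMemcpy(hOk, dOk, sizeof(hOk), cudaMemcpyDeviceToHost);
    cudaFree(dOk);

    int passed = 0;
    for (int i = 0; i < threads; ++i)
        passed += hOk[i];
    printf("%d of %d threads allocated successfully\n", passed, threads);
    return passed == threads ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

With the NVIDIA toolchain this would typically be built with something like nvcc -arch=sm_20 (or a newer architecture). The in-kernel malloc() and the cudaDeviceGetLimit call are the two calls that trigger the assertions quoted above when run under gpuocelot.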

Maybe I have to tell the compiler (somehow) that I want to use the device malloc()?

Any suggestions?

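【Comments】:

I am reasonably certain that the emulator already supports malloc, free and printf, but I am not so sure about the LLVM backend. You should really ask this on the Ocelot mailing list. This isn't really a CUDA question at all, and I am tempted to remove the CUDA tag.

Tags: cuda nvidia dynamic-memory-allocation gpu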

【Solution 1】:

You can find the answer on the gpuocelot mailing list:

gpuocelot mailing list link

【Discussion】: