Is there a CUDA smart pointer?

If not, what is the standard way to free memory allocated with cudaMalloc when an exception is thrown? (Note that I am unable to use Thrust.)


Solution 1:

You can use the RAII idiom: put the cudaMalloc() call in the constructor of a class and the cudaFree() call in its destructor.

When an exception is thrown, stack unwinding calls the destructor, which frees the allocated memory.
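A minimal sketch of that idea follows. It is not taken from any library: DeviceBuffer, HostPolicy, and work_that_throws are hypothetical names, and the allocation policy is backed by host malloc/free so that the example builds without the CUDA toolkit. With CUDA available, the policy would presumably call cudaMalloc in allocate() and cudaFree in release() and check the returned cudaError_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical allocation policy. It uses host malloc/free as a stand-in;
// a CUDA policy would call cudaMalloc(&p, bytes) in allocate() and
// cudaFree(p) in release() instead.
struct HostPolicy {
    static int live;  // outstanding allocations, kept only for demonstration

    static void* allocate(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (!p) throw std::bad_alloc();
        ++live;
        return p;
    }

    static void release(void* p) noexcept {
        assert(live > 0);
        std::free(p);
        --live;
    }
};
int HostPolicy::live = 0;

// RAII owner: acquires in the constructor, releases in the destructor.
template <typename T, typename Policy = HostPolicy>
class DeviceBuffer {
public:
    explicit DeviceBuffer(std::size_t n)
        : ptr_(static_cast<T*>(Policy::allocate(n * sizeof(T)))), size_(n) {}

    ~DeviceBuffer() {
        if (ptr_) Policy::release(ptr_);
    }

    // Copying would lead to a double free, so only moves are allowed.
    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;

    DeviceBuffer(DeviceBuffer&& other) noexcept
        : ptr_(other.ptr_), size_(other.size_) {
        other.ptr_ = nullptr;
        other.size_ = 0;
    }

    DeviceBuffer& operator=(DeviceBuffer&& other) noexcept {
        if (this != &other) {
            if (ptr_) Policy::release(ptr_);
            ptr_ = other.ptr_;
            size_ = other.size_;
            other.ptr_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    T* get() const noexcept { return ptr_; }
    std::size_t size() const noexcept { return size_; }

private:
    T* ptr_;
    std::size_t size_;
};

// Allocates a buffer and then fails. The buffer's destructor runs while the
// exception propagates, so the memory is released without explicit cleanup.
void work_that_throws() {
    DeviceBuffer<float> buf(1024);
    throw std::runtime_error("simulated failure after allocation");
}
```

If a caller wraps work_that_throws() in a try/catch block, HostPolicy::live is back to zero once the handler runs, which is the behaviour you want from the cudaFree() call in the real version.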

If you wrap this object in a smart pointer (or make it behave like a pointer), you get your CUDA smart pointer.
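One way to get pointer-like behaviour without writing a full class is to give std::unique_ptr a custom deleter. The sketch below rests on some assumptions: device_alloc, device_free, DeviceDeleter, device_ptr, make_device, and make_shared_device are hypothetical names, and the two helper functions use host malloc/free as stand-ins so the code builds without the CUDA toolkit. In real code they would presumably wrap cudaMalloc and cudaFree.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical stand-ins for the CUDA calls. With the toolkit available,
// device_alloc would wrap cudaMalloc and device_free would call cudaFree.
inline void* device_alloc(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (!p) throw std::bad_alloc();
    return p;
}

inline void device_free(void* p) noexcept {
    std::free(p);
}

// Deleter invoked by the smart pointer when the owner goes out of scope.
struct DeviceDeleter {
    void operator()(void* p) const noexcept { device_free(p); }
};

// Owning pointer to device memory. Pass get() to kernels; with real device
// memory the pointer must not be dereferenced on the host.
template <typename T>
using device_ptr = std::unique_ptr<T, DeviceDeleter>;

// Allocates room for n elements of T and returns a unique owner.
template <typename T>
device_ptr<T> make_device(std::size_t n) {
    return device_ptr<T>(static_cast<T*>(device_alloc(n * sizeof(T))));
}

// Shared ownership: std::shared_ptr takes over the deleter from unique_ptr.
template <typename T>
std::shared_ptr<T> make_shared_device(std::size_t n) {
    return std::shared_ptr<T>(make_device<T>(n));
}
```

A call such as auto d = make_device<float>(n); then frees the memory automatically when d leaves scope, including during exception unwinding.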

Solution 2:

You can use this custom cuda::shared_ptr implementation. As suggested above, it uses std::shared_ptr as a wrapper for CUDA device memory.

Usage Example:

std::shared_ptr<T[]> data_host(new T[n]);
// ...

// In host code:
fun::cuda::shared_ptr<T> data_dev;
data_dev->upload(data_host.get(), n);
// In .cu file:
// data_dev.data() points to device memory that holds a copy of data_host

The repository consists of a single header file (cudasharedptr.h), so it is easy to adapt if your application requires it.