I have allocated memory for a 3D array using cudaMalloc3D. After the first kernel finishes, I find that I no longer need part of it. For example, in pseudocode:

A = [100,100,100]
kernel() // the data of interest is only in a subrange of A
B = [10:20, 20:100, 50:80] // the part I need; I would like the other entries to be removed
... // new allocations
kernelb()...

I would like to free the rest of the memory (or immediately reuse it for other arrays that I need to allocate now).

I know that I can free the array and reallocate, but that does not seem to be the best option.

P.S. Is there a way to use cudaMallocAsync like cudaMalloc3D? I mean, cudaMalloc3D makes it convenient to work with a 3D array and takes care of the padding.


The current CUDA API does not have realloc functionality. It seems you already know the common workaround: cudaMalloc a smaller array -> cudaMemcpy the subrange into it -> cudaFree the large array.
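In case it helps, here is a minimal sketch of that workaround for the shapes in your pseudocode, assuming float data and that the first index maps to the fastest-varying (x/width) dimension; the kernel calls are just placeholders and error checking is omitted:

```cpp
// Shrink a 100x100x100 float volume to the 10x80x30 sub-block
// [10:20, 20:100, 50:80] by copying it into a smaller pitched
// allocation and freeing the original.
#include <cuda_runtime.h>

int main() {
    // Original 100x100x100 volume (extent width is given in bytes).
    cudaExtent bigExtent = make_cudaExtent(100 * sizeof(float), 100, 100);
    cudaPitchedPtr big;
    cudaMalloc3D(&big, bigExtent);

    // ... kernel() runs here and fills `big` ...

    // Sub-block of interest: x in [10,20), y in [20,100), z in [50,80).
    cudaExtent subExtent = make_cudaExtent(10 * sizeof(float), 80, 30);
    cudaPitchedPtr small;
    cudaMalloc3D(&small, subExtent);

    // Device-to-device copy of just the sub-block.
    cudaMemcpy3DParms p = {};
    p.srcPtr = big;
    p.srcPos = make_cudaPos(10 * sizeof(float), 20, 50); // x offset in bytes, y/z in rows/slices
    p.dstPtr = small;
    p.dstPos = make_cudaPos(0, 0, 0);
    p.extent = subExtent;
    p.kind   = cudaMemcpyDeviceToDevice;
    cudaMemcpy3D(&p);

    // The large buffer can now be released and its memory reused.
    cudaFree(big.ptr);

    // ... allocate new arrays, run kernelb() on `small` ...

    cudaFree(small.ptr);
    return 0;
}
```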

In case you really need realloc-like behavior, you could write your own allocator using the GPU virtual memory management API: https://developer.nvidia.com/blog/introducing-low-level-gpu-virtual-memory-management/
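For completeness, here is a rough sketch of the idea from that post: reserve one virtual address range, back it with several physical chunks via cuMemCreate/cuMemMap, and later unmap and release only the chunks you no longer need, so the surviving data keeps its addresses. The chunk count and sizes below are illustrative, not tied to your 100x100x100 array, and with this driver-level API you would have to handle any 3D pitch/padding yourself:

```cpp
// Hedged sketch of a "shrinkable" allocation built on the CUDA
// virtual memory management (driver) API.
#include <cuda.h>
#include <cstdio>
#include <vector>

#define CHECK(call) do { CUresult r = (call); if (r != CUDA_SUCCESS) { \
    const char* s = nullptr; cuGetErrorString(r, &s); \
    std::printf("CUDA error: %s\n", s); return 1; } } while (0)

int main() {
    CHECK(cuInit(0));
    CUdevice dev;  CHECK(cuDeviceGet(&dev, 0));
    CUcontext ctx; CHECK(cuCtxCreate(&ctx, 0, dev));

    CUmemAllocationProp prop = {};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = dev;

    // Physical chunks must be multiples of the allocation granularity.
    size_t gran = 0;
    CHECK(cuMemGetAllocationGranularity(&gran, &prop,
                                        CU_MEM_ALLOC_GRANULARITY_MINIMUM));

    const size_t numChunks = 4;           // illustrative
    const size_t chunkSize = gran;        // one granule per chunk
    const size_t totalSize = numChunks * chunkSize;

    // Reserve one contiguous virtual address range for the whole array.
    CUdeviceptr base = 0;
    CHECK(cuMemAddressReserve(&base, totalSize, 0, 0, 0));

    // Back every chunk with physical memory and make it accessible.
    CUmemAccessDesc access = {};
    access.location = prop.location;
    access.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;

    std::vector<CUmemGenericAllocationHandle> handles(numChunks);
    for (size_t i = 0; i < numChunks; ++i) {
        CHECK(cuMemCreate(&handles[i], chunkSize, &prop, 0));
        CHECK(cuMemMap(base + i * chunkSize, chunkSize, 0, handles[i], 0));
        CHECK(cuMemSetAccess(base + i * chunkSize, chunkSize, &access, 1));
    }

    // ... run kernel() on the full range [base, base + totalSize) ...

    // "Shrink": return the physical memory of the last two chunks while the
    // first chunks (the data of interest) stay mapped at the same addresses.
    for (size_t i = 2; i < numChunks; ++i) {
        CHECK(cuMemUnmap(base + i * chunkSize, chunkSize));
        CHECK(cuMemRelease(handles[i]));
    }

    // ... run kernelb() on the part that is still mapped ...

    // Final cleanup: unmap/release what is left, then free the reservation.
    for (size_t i = 0; i < 2; ++i) {
        CHECK(cuMemUnmap(base + i * chunkSize, chunkSize));
        CHECK(cuMemRelease(handles[i]));
    }
    CHECK(cuMemAddressFree(base, totalSize));
    CHECK(cuCtxDestroy(ctx));
    return 0;
}
```

Keep in mind that the granularity is typically on the order of a couple of megabytes, so freeing parts of an allocation this way only pays off for fairly large buffers.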