Copy the contents of a 3D cudaArray obtained from an OpenGL texture
I would like to copy the contents of a 3D cudaArray
originating from an OpenGL texture to a "classical" array and vice-versa.
Note: In the following snippets, errors checks are omitted for clarity.
The cudaArray
is "allocated" this way:
cudaArray* texture_array {nullptr};
cudaGraphicsResource* resource{nullptr};
cudaGraphicsGLRegisterImage(&resource, texture.id, GL_TEXTURE_3D, cudaGraphicsRegisterFlagsNone);
cudaGraphicsMapResources(1, &resource, cuda_stream);
cudaGraphicsSubResourceGetMappedArray(&texture_array, resource, array_index, mipmap);
This operation is successful as I am able to obtain relevant information using
cudaArrayGetInfo(&description, &extent, &flags, texture_array)
and obtain things like the following example here with a 512 x 512 x 122
texture in format uint16
.
//C-style pseudo-code
description
{
.x = 16,
.y = 0,
.z = 0,
.w = 0,
.f = cudaChannelFormatKindUnsigned,
};
extent
{
.width = 512,
.height = 512,
.depth = 122
};
flags = 0;
First try: linear array
After reading this answer to a post asking about pitched memory my first try was to use cudaMemcpy3D
and simulate a pitched array with pitch
being the row length in bytes like this:
std::uint8_t* linear_array{nullptr};
const cudaExtent extent =
{
.width = texture.width * texture.pixel_format_byte_size,
.height = texture.height,
.depth = texture.depth
};
cudaMalloc(&linear_array, extent.width * extent.height * extent.depth);
And then copy to it like that:
const cudaMemcpy3DParms copy_info =
{
.srcArray = texture_array,
.srcPos =
{
.x = 0,
.y = 0,
.z = 0
},
.srcPtr =
{
.ptr = nullptr,
.pitch = 0,
.xsize = 0,
.ysize = 0
},
.dstArray = nullptr,
.dstPos =
{
.x = 0,
.y = 0,
.z = 0
},
.dstPtr =
{
.ptr = linear_array,
.pitch = extent.width,
.xsize = texture.width,
.ysize = texture.height,
},
.extent = extent,
.kind = cudaMemcpyDefault,
};
cudaMemcpy3D(©_info)
The code above however produces a cudaErrorInvalidValue
upon call to cudaMemcpy3D
.
Needless to say, the same thing happens if I reverse the two (source becomes destination and vice-versa).
Second try: pitched array
A bit more complicated for me as I intend to modify the data in a __global__
function, but whatever.
Similarly, I allocate a (real) pitched array like this:
cudaPitchedPtr ptr;
const cudaExtent extent =
{
.width = texture.width * texture.pixel_format_byte_size,
.height = texture.height,
.depth = texture.depth,
};
cudaMalloc3D(&ptr, extent);
And copy to it like this:
const cudaMemcpy3DParms copy_info =
{
.srcArray = texture_array,
.srcPos =
{
.x = 0,
.y = 0,
.z = 0
},
.srcPtr =
{
.ptr = nullptr,
.pitch = 0,
.xsize = 0,
.ysize = 0
},
.dstArray = nullptr,
.dstPos =
{
.x = 0,
.y = 0,
.z = 0
},
.dstPtr = ptr,
.extent = extent,
.kind = cudaMemcpyDefault
};
cudaMemcpy3D(©_info);
But I also got cudaErrorInvalidValue
upon call to cudaMemcpy3D
.
What am I doing wrong?
Is a limitation of the API forbidding me to call cudaMemcpy3D
when the array is a texture from a graphics API? If so, what can I do?
After various tests (copying to another cudaArray
and other similar things), it appears the problem came from a misunderstanding.
The documentation clearly states:
" If a CUDA array is participating in the copy, the extent is defined in terms of that array's elements".
Thus, copy_info.extent
has to be (in my context) the extent retrieved by cudaArrayGetInfo
.