Copy the contents of a 3D cudaArray obtained from an OpenGL texture

I would like to copy the contents of a 3D cudaArray originating from an OpenGL texture to a "classical" array and vice-versa.

Note: In the following snippets, errors checks are omitted for clarity.

The cudaArray is "allocated" this way:

cudaArray* texture_array {nullptr};
cudaGraphicsResource* resource{nullptr};

cudaGraphicsGLRegisterImage(&resource, texture.id, GL_TEXTURE_3D, cudaGraphicsRegisterFlagsNone);
cudaGraphicsMapResources(1, &resource, cuda_stream);
cudaGraphicsSubResourceGetMappedArray(&texture_array, resource, array_index, mipmap);

This operation is successful as I am able to obtain relevant information using cudaArrayGetInfo(&description, &extent, &flags, texture_array) and obtain things like the following example here with a 512 x 512 x 122 texture in format uint16.

//C-style pseudo-code

description
{
    .x = 16,
    .y = 0,
    .z = 0,
    .w = 0,
    .f = cudaChannelFormatKindUnsigned,
};

extent
{
    .width  = 512,
    .height = 512,
    .depth  = 122
};

flags = 0;

First try: linear array

After reading this answer to a post asking about pitched memory my first try was to use cudaMemcpy3D and simulate a pitched array with pitch being the row length in bytes like this:

std::uint8_t* linear_array{nullptr};

const cudaExtent extent =
{
    .width  = texture.width  * texture.pixel_format_byte_size,
    .height = texture.height,
    .depth  = texture.depth
};
cudaMalloc(&linear_array, extent.width * extent.height * extent.depth);

And then copy to it like that:

const cudaMemcpy3DParms copy_info =
{
    .srcArray = texture_array,
    .srcPos   =
    {
        .x = 0,
        .y = 0,
        .z = 0
    },
    .srcPtr =
    {
        .ptr   = nullptr,
        .pitch = 0, 
        .xsize = 0,
        .ysize = 0
    },

    .dstArray = nullptr,
    .dstPos   =
    {
        .x = 0,
        .y = 0,
        .z = 0
    },
    .dstPtr = 
    {
        .ptr   = linear_array,
        .pitch = extent.width, 
        .xsize = texture.width,
        .ysize = texture.height,
    }, 

    .extent = extent,
    .kind   = cudaMemcpyDefault,
};

cudaMemcpy3D(&copy_info)

The code above however produces a cudaErrorInvalidValue upon call to cudaMemcpy3D. Needless to say, the same thing happens if I reverse the two (source becomes destination and vice-versa).

Second try: pitched array

A bit more complicated for me as I intend to modify the data in a __global__ function, but whatever.

Similarly, I allocate a (real) pitched array like this:

cudaPitchedPtr ptr;
const cudaExtent extent =
{
    .width  = texture.width * texture.pixel_format_byte_size,
    .height = texture.height,
    .depth  = texture.depth,
};

cudaMalloc3D(&ptr, extent);

And copy to it like this:

const cudaMemcpy3DParms copy_info =
{
    .srcArray = texture_array,
    .srcPos   =
    {
        .x = 0,
        .y = 0,
        .z = 0
    },
    .srcPtr =
    {
        .ptr   = nullptr,
        .pitch = 0,
        .xsize = 0,
        .ysize = 0
    },

    .dstArray = nullptr,
    .dstPos   =
    {
        .x = 0,
        .y = 0,
        .z = 0
    },
    .dstPtr = ptr,

    .extent = extent,
    .kind = cudaMemcpyDefault
};

cudaMemcpy3D(&copy_info);

But I also got cudaErrorInvalidValue upon call to cudaMemcpy3D.


What am I doing wrong? Is a limitation of the API forbidding me to call cudaMemcpy3D when the array is a texture from a graphics API? If so, what can I do?


After various tests (copying to another cudaArray and other similar things), it appears the problem came from a misunderstanding.

The documentation clearly states:

" If a CUDA array is participating in the copy, the extent is defined in terms of that array's elements".

Thus, copy_info.extent has to be (in my context) the extent retrieved by cudaArrayGetInfo.