Passing a list of values to fragment shader
I want to send a list of values into a fragment shader. It is a possibly large (couple of thousand items long) list of single precision floats. The fragment shader needs random access to this list and I want to refresh the values from the CPU on each frame.
I'm considering my options on how this could be done:
As a uniform variable of array type ("uniform float x[10];"). But there seem to be limits here: on my GPU, sending more than a few hundred values is very slow, and I'd also have to hard-code the upper limit in the shader when I'd rather change it at runtime.
As a texture with height 1 and width of my list, then refresh the data using glCopyTexSubImage2D.
Other methods? I haven't kept up with all the changes in the GL-specification lately, perhaps there is some other method that is specifically designed for this purpose?
Solution 1:
There are currently 4 ways to do this: standard 1D textures, buffer textures, uniform buffers, and shader storage buffers.
1D Textures
With this method, you use glTex(Sub)Image1D to fill a 1D texture with your data. Since your data is just an array of floats, your image format should be GL_R32F. You then access it in the shader with a simple texelFetch call. texelFetch takes texel coordinates (hence the name), and it shuts off all filtering. So you get exactly one texel.
Note: texelFetch is 3.0+. If you want to use prior GL versions, you will need to pass the size to the shader and normalize the texture coordinate manually.
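The manual normalization mentioned in the note comes down to sampling at the center of the wanted texel. Below is a minimal sketch of that arithmetic in C, assuming nearest filtering is set on the texture; the helper name texel_center_coord is made up for this example, and the same expression would go into the shader with the size passed in as a uniform.

```c
#include <assert.h>

/* Hypothetical helper: maps an integer texel index to the normalized
   coordinate at the center of that texel in a 1D texture that is
   `size` texels wide. Aiming at the center keeps the lookup away from
   the boundary between two neighboring texels. */
float texel_center_coord(int index, int size)
{
    assert(size > 0 && index >= 0 && index < size);
    return ((float)index + 0.5f) / (float)size;
}
```

For a 4-texel texture, index 0 maps to 0.125 and index 3 maps to 0.875.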
The main advantages here are compatibility and compactness. This will work on GL 2.1 hardware (using the notation). And you don't have to use GL_R32F formats; you could use GL_R16F half-floats. Or GL_R8 if your data is reasonable for a normalized byte. Size can mean a lot for overall performance.
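To make the GL_R8 option concrete: a normalized byte stores a value from [0, 1] as an integer from 0 to 255, so it trades precision for a quarter of the space of GL_R32F. Here is a rough sketch in C of the CPU-side conversion and the resulting upload size, assuming the values have already been scaled to [0, 1]. Both helper names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: quantizes a float to an unsigned normalized
   byte. Input outside [0, 1] is clamped. */
uint8_t float_to_unorm8(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (uint8_t)(v * 255.0f + 0.5f); /* round to nearest */
}

/* Bytes needed to hold `count` values at 4, 2 or 1 bytes per texel
   (GL_R32F, GL_R16F and GL_R8 respectively). */
size_t list_size_bytes(size_t count, size_t bytes_per_texel)
{
    assert(bytes_per_texel == 4 || bytes_per_texel == 2 || bytes_per_texel == 1);
    return count * bytes_per_texel;
}
```

A list of 2,000 values takes 8,000 bytes as GL_R32F and 2,000 bytes as GL_R8. The shader then sees each byte as value / 255.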
The main disadvantage is the size limitation. You are limited to having a 1D texture of the max texture size. On GL 3.x-class hardware, this will be around 8,192, but is guaranteed to be no less than 4,096.
Uniform Buffer Objects
The way this works is that you declare a uniform block in your shader:
layout(std140) uniform MyBlock
{
float myDataArray[size];
};
You then access that data in the shader just like an array.
Back in C/C++/etc code, you create a buffer object and fill it with floating-point data. Then, you can associate that buffer object with the MyBlock uniform block. More details can be found here.
The principal advantages of this technique are speed and semantics. Speed is due to how implementations treat uniform buffers compared to textures. Texture fetches are global memory accesses. Uniform buffer accesses are generally not; the uniform buffer data is usually loaded into the shader when the shader is initialized upon its use in rendering. From there, it is a local access, which is much faster.
Semantically, this is better because it isn't just a flat array. For your specific needs, if all you need is a float[], that doesn't matter. But if you have a more complex data structure, the semantics can be important. For example, consider an array of lights. Lights have a position and a color. If you use a texture, your code to get the position and color for a particular light looks like this:
vec4 position = texelFetch(myDataArray, 2*index);
vec4 color = texelFetch(myDataArray, 2*index + 1);
With uniform buffers, it looks just like any other uniform access. You have named members that can be called position and color. So all the semantic information is there; it's easier to understand what's going on.
There are size limitations for this as well. OpenGL requires that implementations provide at least 16,384 bytes for the maximum size of uniform blocks. For tightly packed floats, that is only 4,096 values. Be aware that std140 pads each element of a float array to 16 bytes, so a plain float[] holds just 1,024 elements unless the data is packed into vec4s. Note again that this is the minimum required from implementations; some hardware can offer much larger buffers. AMD provides 65,536 on their DX10-class hardware, for example.
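The element counts for a given block size follow from the std140 array stride, which is rounded up to 16 bytes per element. The C sketch below works through that arithmetic and shows the index split needed when floats are packed four to a vec4. All function names are hypothetical, and the stride rule is the one std140 specifies for arrays of scalars.

```c
#include <assert.h>
#include <stddef.h>

enum { STD140_ARRAY_STRIDE = 16 }; /* bytes per array element under std140 */

/* Elements that fit when the block declares `float data[N]`:
   each float is padded out to a full 16-byte slot. */
size_t std140_float_array_capacity(size_t block_bytes)
{
    return block_bytes / STD140_ARRAY_STRIDE;
}

/* Floats that fit when the block declares `vec4 data[N]` and the
   application packs four floats into every vec4. */
size_t std140_packed_float_capacity(size_t block_bytes)
{
    return (block_bytes / STD140_ARRAY_STRIDE) * 4;
}

/* Splits a flat list index into the vec4 element and the component
   inside it, i.e. data[element][component] in the shader. */
void packed_index(size_t i, size_t *element, size_t *component)
{
    assert(element != NULL && component != NULL);
    *element = i / 4;
    *component = i % 4;
}
```

With the guaranteed 16,384 bytes, the first function gives 1,024 and the second gives 4,096. On hardware that reports a larger GL_MAX_UNIFORM_BLOCK_SIZE, both numbers scale accordingly.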
Buffer Textures
These are kind of a "super 1D texture". They effectively allow you to access a buffer object from a texture unit. Though they are one-dimensional, they are not 1D textures.
You can only use them from GL 3.0 or above. And you can only access them via the texelFetch function.
The main advantage here is size. Buffer textures can generally be pretty gigantic. While the spec is generally conservative, mandating a minimum of 65,536 texels for buffer textures, most GL implementations allow them to range in the megabytes in size. Indeed, usually the maximum size is limited by the GPU memory available, not hardware limits.
Also, buffer textures are stored in buffer objects, not the more opaque texture objects like 1D textures. This means you can use some buffer object streaming techniques to update them.
The main disadvantage here is performance, just like with 1D textures. Buffer textures probably won't be any slower than 1D textures, but they won't be as fast as UBOs either. If you're just pulling one float from them, it shouldn't be a concern. But if you're pulling lots of data from them, consider using a UBO instead.
Shader Storage Buffer Objects
OpenGL 4.3 provides another way to handle this: shader storage buffers. They're a lot like uniform buffers; you specify them using syntax almost identical to that of uniform blocks. The principal difference is that you can write to them. Obviously that's not useful for your needs, but there are other differences.
Shader storage buffers are, conceptually speaking, an alternate form of buffer texture. Thus, the size limits for shader storage buffers are a lot larger than for uniform buffers. The OpenGL minimum for the max UBO size is 16KB. The OpenGL minimum for the max SSBO size is 16MB. So if you have the hardware, they're an interesting alternative to UBOs.
Just be sure to declare them as readonly, since you're not writing to them.
The potential disadvantage here is performance again, relative to UBOs. SSBOs work like an image load/store operation through buffer textures. Basically, it's (very nice) syntactic sugar around an imageBuffer image type. As such, reads from these will likely perform at the speed of reads from a readonly imageBuffer.
Whether reading via image load/store through buffer images is faster or slower than buffer textures is unclear at this point.
Another potential issue is that you must abide by the rules for non-synchronous memory access. These are complex and can very easily trip you up.
Solution 2:
This sounds like a nice use case for texture buffer objects. These don't have much to do with regular textures and basically allow you to access a buffer object's memory in a shader as a simple linear array. They are similar to 1D textures, but are not filtered and only accessed by an integer index, which sounds like what you need to do when you call it a list of values. And they also support much larger sizes than 1D textures. For updating it you can then use the standard buffer object methods (glBufferData, glMapBuffer, ...).
On the other hand, they require GL3/DX10 hardware and have been core since OpenGL 3.1, I think. If your hardware/driver doesn't support them, then your 2nd solution would be the method of choice (but use a 1D texture rather than a width x 1 2D texture). In this case you can also use a non-flat 2D texture and some index magic to support lists larger than the maximum texture size.
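The "index magic" for a 2D texture is just a remainder and a division. Here is a minimal sketch in C, assuming the list is laid out row by row in a texture that is `width` texels wide. The type and function names are illustrative only, and on older hardware the resulting texel coordinates would still have to be normalized before the lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative type: integer texel position inside a 2D texture. */
typedef struct {
    size_t x;
    size_t y;
} TexelCoord;

/* Maps a flat list index to the texel that stores it, assuming
   row-major layout with `width` texels per row. */
TexelCoord list_index_to_texel(size_t index, size_t width)
{
    assert(width > 0);
    TexelCoord c = { index % width, index / width };
    return c;
}

/* Rows needed to hold `count` values in a texture `width` texels wide
   (rounds up so a partially filled last row is included). */
size_t rows_needed(size_t count, size_t width)
{
    assert(width > 0);
    return (count + width - 1) / width;
}
```

For example, a 5,000-item list in a 4,096-wide texture needs 2 rows, and item 5000 of a longer list would sit at x = 904, y = 1.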
But texture buffers are the perfect match for your problem, I think. For more exact insight you might also look into the corresponding extension specification.
EDIT: In response to Nicol's comment about uniform buffer objects, you can also look here for a little comparison of the two. I still lean toward TBOs, but can't really give a reason beyond seeing them as a better conceptual fit. But maybe Nicol can provide an answer with some more insight into the matter.
Solution 3:
One way would be to use uniform arrays like you mention. Another way is to use a 1D "texture". Look for GL_TEXTURE_1D and glTexImage1D. I personally prefer this way, as you don't need to hard-code the size of the array in the shader code as you said, and OpenGL already has built-in functions for uploading/accessing 1D data on the GPU.