How does OpenGL work at the lowest level?

This question is almost impossible to answer, because OpenGL by itself is just a front-end API: as long as an implementation adheres to the specification and its output conforms to it, it can be done any way you like.

The question may have been: How does an OpenGL driver work at the lowest level? This, again, is impossible to answer in general, because a driver is closely tied to a particular piece of hardware, which may in turn do things however its developer designed it.

So the question should have been: "How does it look on average behind the scenes of OpenGL and the graphics system?". Let's look at this from the bottom up:

  1. At the lowest level there's some graphics device. Nowadays these are GPUs, which provide a set of registers controlling their operation (which registers exactly is device dependent), have some program memory for shaders, bulk memory for input data (vertices, textures, etc.) and an I/O channel to the rest of the system over which they receive and send data and command streams.

  2. The graphics driver keeps track of the GPU's state and of all the resources held by application programs that make use of the GPU. It is also responsible for converting or otherwise processing the data sent by applications (converting textures into a pixel format supported by the GPU, compiling shaders into the machine code of the GPU). Furthermore, it provides some abstract, driver dependent interface to application programs.

  3. Then there's the driver dependent OpenGL client library/driver. On Windows this gets loaded by proxy through opengl32.dll, on Unix systems this resides in two places:

    • X11 GLX module and driver dependent GLX driver
    • and /usr/lib/libGL.so may contain some driver dependent stuff for direct rendering

    On MacOS X this happens to be the "OpenGL Framework".

    It is this part that translates OpenGL calls, as you make them, into calls to the driver specific functions in the part of the driver described in (2).

  4. Finally the actual OpenGL API library, opengl32.dll in Windows, and on Unix /usr/lib/libGL.so; this mostly just passes down the commands to the OpenGL implementation proper.
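To make the split between (4) and (3) a bit more concrete, here is a minimal sketch of an API library that does nothing but forward calls through a table of function pointers filled in by a vendor library. Every name in it (`dispatch_table`, `api_clear`, `vendor_clear` and so on) is made up for illustration; this is not how any particular opengl32.dll or libGL.so is actually written, only the general shape of "pass the command down".

```c
#include <stddef.h>

/* Entry points that a vendor's client driver (layer 3) would provide.
   The table layout and all names are hypothetical. */
struct dispatch_table {
    int (*clear)(float r, float g, float b);
    int (*draw_triangles)(const float *xy, int vertex_count);
};

/* ---- layer 3: a made-up vendor implementation ---- */
static int vendor_commands_queued = 0;

static int vendor_clear(float r, float g, float b) {
    (void)r; (void)g; (void)b;          /* a real driver would encode these */
    return ++vendor_commands_queued;    /* pretend a command was queued */
}

static int vendor_draw_triangles(const float *xy, int vertex_count) {
    if (xy == NULL || vertex_count % 3 != 0)
        return -1;                      /* reject incomplete triangles */
    return ++vendor_commands_queued;
}

static const struct dispatch_table vendor_table = {
    vendor_clear, vendor_draw_triangles
};

/* ---- layer 4: the thin API library ---- */
static const struct dispatch_table *current_table = NULL;

/* Stand-in for "load the vendor library and fetch its entry points". */
void api_load_driver(void) { current_table = &vendor_table; }

int api_clear(float r, float g, float b) {
    return current_table ? current_table->clear(r, g, b) : -1;
}

int api_draw_triangles(const float *xy, int vertex_count) {
    return current_table ? current_table->draw_triangles(xy, vertex_count) : -1;
}
```

Until `api_load_driver()` has run, every API call fails with -1; afterwards each call is a plain indirect function call into the vendor code, which matches the "loaded by proxy" description above.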

How the actual communication happens cannot be generalized:

On Unix, the 3<->4 connection may happen either over sockets (yes, it can go over the network, and does if you want it to) or through shared memory. On Windows, the interface library and the driver client are both loaded into the process address space, so it is not so much communication as simple function calls and variable/pointer passing. On MacOS X this is similar to Windows, except that there is no separation between the OpenGL interface and the driver client (which is why MacOS X is so slow to keep up with new OpenGL versions: it always requires a full operating system upgrade to deliver the new framework).

Communication between 3<->2 may go through ioctl, read/write, or through mapping some memory into the process address space and configuring the MMU to trigger some driver code whenever that memory is modified. This is quite similar on any operating system, since you always have to cross the kernel/userland boundary: ultimately you go through some syscall.
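As a rough illustration of the "shared memory plus a syscall" variant, here is a sketch of a command ring that the userspace side fills and a stand-in function that plays the role of the kernel driver draining it. The structure, the function names and the command words are all invented; a real driver would put the ring in memory mapped into both sides and would be entered via ioctl or a similar syscall rather than a direct function call.

```c
#include <stddef.h>
#include <stdint.h>

enum { RING_WORDS = 64 };

/* Memory that a real driver would map into both process and kernel.
   Here it is just an ordinary struct. */
struct cmd_ring {
    uint32_t words[RING_WORDS];
    size_t   head;  /* next word the "kernel" side reads   */
    size_t   tail;  /* next word the userspace side writes */
};

/* Userspace side: append one command word; returns -1 if the ring is full. */
int ring_push(struct cmd_ring *r, uint32_t word) {
    size_t next = (r->tail + 1) % RING_WORDS;
    if (next == r->head)
        return -1;
    r->words[r->tail] = word;
    r->tail = next;
    return 0;
}

/* Stand-in for the syscall: the "kernel" consumes what has been queued,
   copying at most out_cap words to out, and returns how many it took. */
size_t fake_submit_syscall(struct cmd_ring *r, uint32_t *out, size_t out_cap) {
    size_t n = 0;
    while (r->head != r->tail && n < out_cap) {
        out[n++] = r->words[r->head];
        r->head = (r->head + 1) % RING_WORDS;
    }
    return n;
}
```

The point of the design is that queuing commands costs only memory writes; the expensive kernel/userland crossing happens once per batch rather than once per command.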

Communication between the system and the GPU happens through the peripheral bus and the access methods it defines, so PCI, AGP, PCI-E, etc., which work through Port-I/O, Memory Mapped I/O, DMA and IRQs.
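For a feel of what Memory Mapped I/O looks like from the driver's side, here is a small sketch that writes to a block of "registers" through volatile fields. The register layout (`cmd_base`, `cmd_length`, `doorbell`) is purely hypothetical, since the actual registers are device dependent, and the struct here lives in ordinary memory rather than at a bus address.

```c
#include <stdint.h>

/* Invented register block; on real hardware this would be a mapped
   region of the device's address space, not a normal struct. */
struct fake_gpu_regs {
    volatile uint32_t cmd_base;    /* where the command buffer starts */
    volatile uint32_t cmd_length;  /* how many words it contains      */
    volatile uint32_t doorbell;    /* write 1 to start processing     */
};

/* Tell the (pretend) device about a command buffer and kick it off.
   The doorbell is written last so the other fields are set up first. */
void mmio_kick(struct fake_gpu_regs *regs, uint32_t base, uint32_t length) {
    regs->cmd_base   = base;
    regs->cmd_length = length;
    regs->doorbell   = 1;
}
```

The `volatile` qualifier is what keeps the compiler from dropping or merging these stores; a real driver would additionally need whatever memory barriers its bus and CPU require.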


When I compile this program, will it be turned into a series of ioctl-calls, and the gpu driver then sends the appropriate commands to the gpu, where all the logic of rotating the triangle and setting the appropriate pixels in the appropriate color is wired in? Or will the program be compiled into a "gpu program" which is loaded onto the gpu and computes the rotation etc.?

You're not far off. Your program calls the installable client driver (which is not really a driver, it's a userspace shared library). That will use ioctl or a similar mechanism to pass data to the kernel driver.

For the next part, it depends on the hardware. Older video cards had what is called a "fixed-function pipeline". There were dedicated memory spaces in the video card for matrices, and dedicated hardware for texture lookup, blending, etc. The video driver would load the right data and flags for each of these units and then set up DMA to transfer your vertex data (position, color, texture coordinates, etc).
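Here is a small software model of that idea, just to show the division of labour: the driver loads a matrix into a dedicated slot, and a fixed transform stage then applies it to every vertex that streams in. The names (`ff_state`, `ff_load_matrix`, `ff_transform`) are invented, and the 2x2 matrix is a simplification of the 4x4 matrices real fixed-function hardware used.

```c
/* State held in the card's dedicated matrix memory (simplified to 2x2). */
struct ff_state {
    float m[2][2];
};

/* Driver side: load the matrix the application asked for. */
void ff_load_matrix(struct ff_state *s,
                    float m00, float m01, float m10, float m11) {
    s->m[0][0] = m00; s->m[0][1] = m01;
    s->m[1][0] = m10; s->m[1][1] = m11;
}

/* "Hardware" side: the same hard-wired transform for every vertex.
   in and out hold count interleaved (x, y) pairs. */
void ff_transform(const struct ff_state *s,
                  const float *in, float *out, int count) {
    int i;
    for (i = 0; i < count; i++) {
        float x = in[2 * i];
        float y = in[2 * i + 1];
        out[2 * i]     = s->m[0][0] * x + s->m[0][1] * y;
        out[2 * i + 1] = s->m[1][0] * x + s->m[1][1] * y;
    }
}
```

Loading the matrix (0, -1, 1, 0) rotates every vertex by 90 degrees counter-clockwise, so (1, 0) becomes (0, 1). Note that the program never supplies the transform logic itself, only the data and flags that configure it.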

Newer hardware has processor cores ("shaders") inside the video card, which differ from your CPU in that they each run much slower, but there are many more of them working in parallel. For these video cards, the driver prepares program binaries to run on the GPU shaders.
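The difference from the fixed-function case can be sketched like this: instead of configuring a hard-wired transform, the driver hands the hardware a small program, and that same program is run once per vertex. In this model a C function pointer stands in for the compiled GPU binary and a plain loop stands in for the many cores working in parallel; `vertex_program`, `rotate_program` and `run_on_all_cores` are made-up names.

```c
typedef struct { float x, y; } vec2;

/* Stand-in for a compiled shader binary: one vertex in, one vertex out.
   uniforms is read-only data shared by every invocation. */
typedef vec2 (*vertex_program)(vec2 in, const float *uniforms);

/* Example program: rotate by an angle whose cosine and sine are passed
   as uniforms[0] and uniforms[1]. */
vec2 rotate_program(vec2 v, const float *uniforms) {
    vec2 out;
    out.x = uniforms[0] * v.x - uniforms[1] * v.y;
    out.y = uniforms[1] * v.x + uniforms[0] * v.y;
    return out;
}

/* Each iteration is independent of the others, which is what lets real
   hardware spread the work across many slow cores at once. */
void run_on_all_cores(vertex_program prog, const float *uniforms,
                      const vec2 *in, vec2 *out, int count) {
    int i;
    for (i = 0; i < count; i++)
        out[i] = prog(in[i], uniforms);
}
```

Swapping `rotate_program` for any other function with the same signature changes what the "GPU" computes without touching the dispatch loop, which is roughly the flexibility that programmable shaders added.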