How do the CPU and GPU interact in displaying computer graphics?
Solution 1:
I decided to write a bit about the programming aspect and how components talk to each other. Maybe it'll shed some light on certain areas.
The Presentation
What does it take to even have that single image, that you posted in your question, drawn on the screen?
There are many ways to draw a triangle on the screen. For simplicity, let's assume no vertex buffers were used. (A vertex buffer is an area of memory where you store coordinates.) Let's assume the program simply told the graphics processing pipeline about every single vertex (a vertex is just a coordinate in space) in a row.
But, before we can draw anything, we first have to run some scaffolding. We'll see why later:
// Clear The Screen And The Depth Buffer
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
// Reset The Current Modelview Matrix
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
// Drawing Using Triangles
glBegin(GL_TRIANGLES);
    // Red
    glColor3f(1.0f, 0.0f, 0.0f);
    // Top Of Triangle (Front)
    glVertex3f( 0.0f,  1.0f, 0.0f);
    // Green
    glColor3f(0.0f, 1.0f, 0.0f);
    // Left Of Triangle (Front)
    glVertex3f(-1.0f, -1.0f, 1.0f);
    // Blue
    glColor3f(0.0f, 0.0f, 1.0f);
    // Right Of Triangle (Front)
    glVertex3f( 1.0f, -1.0f, 1.0f);
// Done Drawing
glEnd();
So what did that do?
When you write a program that wants to use the graphics card, you'll usually pick some kind of interface to the driver. Some well known interfaces to the driver are:
- OpenGL
- Direct3D
- CUDA
For this example we'll stick with OpenGL. Now, your interface to the driver is what gives you all the tools you need to make your program talk to the graphics card (or the driver, which then talks to the card).
This interface is bound to give you certain tools. These tools take the shape of an API which you can call from your program.
That API is what we see being used in the example above. Let's take a closer look.
The Scaffolding
Before you can really do any actual drawing, you'll have to perform some setup. You have to define your viewport (the area that will actually be rendered), your perspective (the camera into your world), what anti-aliasing you will be using (to smooth out the edges of your triangle)...
But we won't look at any of that. We'll just take a peek at the stuff you'll have to do every frame. Like:
Clearing the screen
The graphics pipeline is not going to clear the screen for you every frame. You'll have to tell it. Why? This is why:
If you don't clear the screen, you'll simply draw over it every frame. That's why we call glClear with the GL_COLOR_BUFFER_BIT set. The other bit (GL_DEPTH_BUFFER_BIT) tells OpenGL to clear the depth buffer. This buffer is used to determine which pixels are in front of (or behind) other pixels.
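To make the two buffers a bit less abstract, here's a rough CPU-side sketch of what clearing and depth testing boil down to. The `Framebuffer` struct, the tiny 4x4 size and the function names are all made up for illustration; a real driver and GPU do this very differently, but the idea is the same: one color and one depth value per pixel, and a new pixel only wins if it is nearer than the stored one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FB_WIDTH  4
#define FB_HEIGHT 4

/* Hypothetical software framebuffer: one color and one depth value per pixel. */
typedef struct {
    uint32_t color[FB_WIDTH * FB_HEIGHT]; /* packed 0xRRGGBBAA */
    float    depth[FB_WIDTH * FB_HEIGHT]; /* 0.0 = nearest, 1.0 = farthest */
} Framebuffer;

/* Rough analogue of glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT). */
void fb_clear(Framebuffer *fb, uint32_t clear_color)
{
    assert(fb != NULL);
    for (size_t i = 0; i < FB_WIDTH * FB_HEIGHT; ++i) {
        fb->color[i] = clear_color; /* forget last frame's picture */
        fb->depth[i] = 1.0f;        /* everything starts as far away as possible */
    }
}

/* Write a pixel only if it is nearer than what is already stored there.
   Returns 1 if the pixel was written, 0 if it was hidden behind something. */
int fb_put_pixel(Framebuffer *fb, int x, int y, float z, uint32_t color)
{
    assert(fb != NULL);
    assert(x >= 0 && x < FB_WIDTH && y >= 0 && y < FB_HEIGHT);
    size_t i = (size_t)y * FB_WIDTH + (size_t)x;
    if (z >= fb->depth[i]) {
        return 0;
    }
    fb->depth[i] = z;
    fb->color[i] = color;
    return 1;
}
```

If you skip `fb_clear` between frames, both last frame's colors and last frame's depth values are still there, so new geometry is drawn over (or hidden behind) stale pixels.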
Transformation
Transformation is the part where we take all the input coordinates (the vertices of our triangle) and apply our ModelView matrix. This is the matrix that describes how our model (the vertices) is rotated, scaled, and translated (moved).
Next, we apply our Projection matrix. This maps all coordinates into the camera's view volume, so that they appear with the correct perspective.
Now we transform once more, with our Viewport matrix. We do this to scale our model to the size of the viewport, in window pixels. Now we have a set of vertices that are ready to be rendered!
We'll come back to transformation a bit later.
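Here's a minimal sketch of that chain for a single vertex, done by hand on the CPU. The type and function names (`Vec4`, `Mat4`, `transform_vertex`, ...) are my own, not part of OpenGL. It also includes one step I glossed over above: after the projection, the coordinates are divided by their `w` component (the perspective divide) before the viewport mapping is applied.

```c
#include <assert.h>

typedef struct { float x, y, z, w; } Vec4;
typedef struct { float m[4][4]; } Mat4; /* m[row][column] */

/* Multiply a 4x4 matrix by a column vector. */
Vec4 mat4_mul_vec4(const Mat4 *a, Vec4 v)
{
    Vec4 r;
    r.x = a->m[0][0] * v.x + a->m[0][1] * v.y + a->m[0][2] * v.z + a->m[0][3] * v.w;
    r.y = a->m[1][0] * v.x + a->m[1][1] * v.y + a->m[1][2] * v.z + a->m[1][3] * v.w;
    r.z = a->m[2][0] * v.x + a->m[2][1] * v.y + a->m[2][2] * v.z + a->m[2][3] * v.w;
    r.w = a->m[3][0] * v.x + a->m[3][1] * v.y + a->m[3][2] * v.z + a->m[3][3] * v.w;
    return r;
}

/* Perspective divide, then map the [-1, 1] range onto the viewport rectangle. */
void to_window(Vec4 clip, int vp_x, int vp_y, int vp_w, int vp_h,
               float *win_x, float *win_y)
{
    assert(clip.w != 0.0f);
    float ndc_x = clip.x / clip.w;
    float ndc_y = clip.y / clip.w;
    *win_x = (float)vp_x + (ndc_x + 1.0f) * 0.5f * (float)vp_w;
    *win_y = (float)vp_y + (ndc_y + 1.0f) * 0.5f * (float)vp_h;
}

/* ModelView, then Projection, then viewport: the full chain for one vertex. */
void transform_vertex(const Mat4 *modelview, const Mat4 *projection, Vec4 vertex,
                      int vp_w, int vp_h, float *win_x, float *win_y)
{
    Vec4 eye  = mat4_mul_vec4(modelview, vertex);
    Vec4 clip = mat4_mul_vec4(projection, eye);
    to_window(clip, 0, 0, vp_w, vp_h, win_x, win_y);
}
```

With identity matrices and an 800x600 viewport, the vertex (0, 1, 0) ends up at the top center of the window (OpenGL window coordinates start at the bottom left).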
Drawing
To draw a triangle, we can simply tell OpenGL to start a new list of triangles by calling glBegin with the GL_TRIANGLES constant.
There are also other forms you can draw, like a triangle strip or a triangle fan. These are primarily optimizations, as they require less communication between the CPU and the GPU to draw the same number of triangles.
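To put a number on "less communication", here's a tiny sketch that counts how many vertices you have to send for n connected triangles. The function names are made up; the counts follow from how the primitives are defined: a list needs 3 fresh vertices per triangle, while in a strip or fan every triangle after the first reuses 2 vertices that were already sent.

```c
#include <assert.h>

/* Vertices the CPU must submit to draw n triangles as a plain list. */
unsigned vertices_for_triangle_list(unsigned n)
{
    return 3u * n;
}

/* Vertices needed for n connected triangles as a strip. */
unsigned vertices_for_triangle_strip(unsigned n)
{
    assert(n > 0);
    return n + 2u;
}

/* Vertices needed for n connected triangles as a fan. */
unsigned vertices_for_triangle_fan(unsigned n)
{
    assert(n > 0);
    return n + 2u;
}
```

For 100 connected triangles that is 300 vertices as a list versus 102 as a strip.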
After that, we can provide a list of sets of 3 vertices which should make up each triangle. Every vertex uses 3 coordinates (as we're in 3D space). Additionally, I also provide a color for each vertex, by calling glColor3f before calling glVertex3f.
The shade between the 3 vertices (the 3 corners of the triangle) is calculated by OpenGL automatically. It will interpolate the color over the whole face of the polygon.
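Conceptually, that interpolation is a weighted average of the three vertex colors, where the weights say how close a point is to each corner (barycentric coordinates). Below is a simplified 2D sketch of the idea; the names are mine, and the real hardware does this per pixel during rasterization, with extra care for perspective.

```c
#include <assert.h>

typedef struct { float x, y; } Vec2;
typedef struct { float r, g, b; } Color;

/* Twice the signed area of the triangle (a, b, c). */
static float edge(Vec2 a, Vec2 b, Vec2 c)
{
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

/* Blend the three vertex colors at point p using barycentric weights. */
Color interpolate_color(Vec2 p, Vec2 v0, Vec2 v1, Vec2 v2,
                        Color c0, Color c1, Color c2)
{
    float area = edge(v0, v1, v2);
    assert(area != 0.0f); /* a degenerate triangle has no interior */

    float w0 = edge(v1, v2, p) / area; /* weight of v0 */
    float w1 = edge(v2, v0, p) / area; /* weight of v1 */
    float w2 = edge(v0, v1, p) / area; /* weight of v2 */

    Color out;
    out.r = w0 * c0.r + w1 * c1.r + w2 * c2.r;
    out.g = w0 * c0.g + w1 * c1.g + w2 * c2.g;
    out.b = w0 * c0.b + w1 * c1.b + w2 * c2.b;
    return out;
}
```

With the red/green/blue triangle from above, a point exactly on the top corner comes out pure red, and the midpoint of the bottom edge comes out as an even mix of green and blue.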
Interaction
Now, when you click the window, the application only has to capture the window message that signals the click. Then you can run any action you want in your program.
This gets a lot more difficult once you want to start interacting with your 3D scene.
You first have to clearly know at which pixel the user clicked the window. Then, taking your perspective into account, you can calculate the direction of a ray, from the point of the mouse click into your scene. You can then calculate if any object in your scene intersects with that ray. Now you know if the user clicked an object.
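Here's a rough sketch of those two steps: turning a clicked pixel into a ray, and testing that ray against one triangle. Everything here is illustrative and assumed by me: the camera sits at the origin looking down the negative Z axis, the caller passes in `tan_half_fov` (the tangent of half the vertical field of view), and the hit test is the well-known Möller-Trumbore algorithm, which is just one of several ways to do it.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

static Vec3 sub(Vec3 a, Vec3 b) { Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b)
{
    Vec3 r = { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
    return r;
}

/* Direction of the pick ray for a click at pixel (px, py), with (0, 0) at the
   top left of the window. The result is not normalized. */
Vec3 pick_ray_direction(int px, int py, int width, int height, float tan_half_fov)
{
    assert(width > 0 && height > 0);
    float aspect = (float)width / (float)height;
    /* Pixel center to the [-1, 1] range; window y grows downward, so flip it. */
    float ndc_x = 2.0f * ((float)px + 0.5f) / (float)width - 1.0f;
    float ndc_y = 1.0f - 2.0f * ((float)py + 0.5f) / (float)height;
    Vec3 dir = { ndc_x * aspect * tan_half_fov, ndc_y * tan_half_fov, -1.0f };
    return dir;
}

/* Moller-Trumbore ray/triangle test. On a hit, returns 1 and stores the
   distance along the ray (in multiples of dir) in *t. */
int ray_hits_triangle(Vec3 origin, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2, float *t)
{
    const float eps = 1e-7f;
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (det > -eps && det < eps) return 0;  /* ray is parallel to the triangle */

    float inv = 1.0f / det;
    Vec3 s = sub(origin, v0);
    float u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return 0;     /* outside the triangle */

    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return 0; /* outside the triangle */

    float dist = dot(e2, q) * inv;
    if (dist <= eps) return 0;              /* triangle is behind the camera */
    *t = dist;
    return 1;
}
```

In a real scene you would run the test against every candidate object (usually after a cheaper bounding-box check) and keep the hit with the smallest distance.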
So, how do you make it rotate?
Transformation
I am aware of two types of transformations that are generally applied:
- Matrix-based transformation
- Bone-based transformation
The difference is that bones affect single vertices. Matrices always affect all drawn vertices in the same way. Let's look at an example.
Example
Earlier, we loaded our identity matrix before drawing our triangle. The identity matrix is one that simply provides no transformation at all. So, whatever I draw, is only affected by my perspective. So, the triangle will not be rotated at all.
If I want to rotate it now, I could either do the math myself (on the CPU) and simply call glVertex3f with other coordinates (that are rotated). Or I could let the GPU do all the work, by calling glRotatef before drawing:
// Rotate The Triangle On The Y axis
glRotatef(amount,0.0f,1.0f,0.0f);
amount is, of course, just a fixed value. If you want to animate, you'll have to keep track of amount and increase it every frame.
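Keeping track of it can be as simple as the sketch below. The function name and the idea of scaling by elapsed time (so the speed doesn't depend on the frame rate) are my own additions; you'd call it once per frame and pass the result to glRotatef.

```c
#include <assert.h>

/* Advance a rotation angle (in degrees) by one frame and keep it in [0, 360). */
float advance_angle(float amount, float degrees_per_second, float frame_seconds)
{
    assert(degrees_per_second >= 0.0f && frame_seconds >= 0.0f);
    amount += degrees_per_second * frame_seconds;
    while (amount >= 360.0f) {
        amount -= 360.0f; /* wrap around so the value never grows without bound */
    }
    return amount;
}
```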
So, wait, what happened to all the matrix talk earlier?
In this simple example, we don't have to care about matrices. We simply call glRotatef and it takes care of all that for us.
glRotate produces a rotation of angle degrees around the vector (x, y, z). The current matrix (see glMatrixMode) is multiplied by a rotation matrix with the product replacing the current matrix, as if glMultMatrix were called with the following matrix as its argument:
$$
\begin{pmatrix}
x^2(1-c)+c & xy(1-c)-zs & xz(1-c)+ys & 0 \\
yx(1-c)+zs & y^2(1-c)+c & yz(1-c)-xs & 0 \\
xz(1-c)-ys & yz(1-c)+xs & z^2(1-c)+c & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$
Well, thanks for that!
Conclusion
What becomes obvious is, there's a lot of talk to OpenGL. But it's not telling us anything. Where is the communication?
The only thing that OpenGL is telling us in this example is when it's done. Every operation will take a certain amount of time. Some operations take incredibly long, others are incredibly quick.
Sending a vertex to the GPU will be so fast, I wouldn't even know how to express it. Sending thousands of vertices from the CPU to the GPU, every single frame, is, most likely, no issue at all.
Clearing the screen can take a millisecond or more (keep in mind, at 60 frames per second you only have about 16 milliseconds to draw each frame), depending on how large your viewport is. To clear it, OpenGL has to draw every single pixel in the color you want to clear to, and that could be millions of pixels.
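The arithmetic behind those numbers is simple enough to write down. The helper names below are made up for this sketch; the 60 frames per second and 1920x1080 values in the note are just common examples.

```c
#include <assert.h>

/* Time available per frame, in milliseconds, at a given frame rate. */
double frame_budget_ms(double frames_per_second)
{
    assert(frames_per_second > 0.0);
    return 1000.0 / frames_per_second;
}

/* Number of pixels a full-viewport clear has to touch. */
unsigned long pixels_to_clear(unsigned width, unsigned height)
{
    return (unsigned long)width * (unsigned long)height;
}
```

At 60 frames per second the budget is roughly 16.7 ms, and a 1920x1080 viewport has 2,073,600 pixels to clear.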
Other than that, we can pretty much only ask OpenGL about the capabilities of our graphics adapter (max resolution, max anti-aliasing, max color depth, ...).
But we can also fill a texture with pixels that each have a specific color. Each pixel thus holds a value and the texture is a giant "file" filled with data. We can load that into the graphics card (by creating a texture buffer), then load a shader, tell that shader to use our texture as an input and run some extremely heavy calculations on our "file".
We can then "render" the result of our computation (in the form of new colors) into a new texture.
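As a CPU-side analogy (not real GPU code), the pattern looks like the sketch below: one small function is run for every texel of an input "texture", and the results land in an output "texture". The `Texel` type, `run_shader` and the `invert` example are all hypothetical; on a real GPU the per-texel function would be a shader and many texels would be processed in parallel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b, a; } Texel;

/* A "shader" here is just a function that maps one input texel to one output texel. */
typedef Texel (*PixelShader)(Texel in);

/* Run the shader once per texel, reading from input and "rendering" into output. */
void run_shader(const Texel *input, Texel *output, size_t texel_count,
                PixelShader shader)
{
    assert(input != NULL && output != NULL && shader != NULL);
    for (size_t i = 0; i < texel_count; ++i) {
        output[i] = shader(input[i]);
    }
}

/* Example computation: invert the color channels, leave alpha alone. */
Texel invert(Texel in)
{
    Texel out;
    out.r = (uint8_t)(255 - in.r);
    out.g = (uint8_t)(255 - in.g);
    out.b = (uint8_t)(255 - in.b);
    out.a = in.a;
    return out;
}
```

The data doesn't have to be colors at all; any values you can pack into texels can be pushed through the same mechanism.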
That's how you can make the GPU work for you in other ways. I assume CUDA works similarly in that respect, but I never had the opportunity to work with it.
We really only slightly touched the whole subject. 3D graphics programming is a hell of a beast.
Solution 2:
It's hard to understand exactly what it is you don't understand.
The GPU has a series of registers that the BIOS maps. These permit the CPU to access the GPU's memory and instruct the GPU to perform operations. The CPU plugs values into those registers to map some of the GPU's memory so that the CPU can access it. Then it loads instructions into that memory. It then writes a value to a register that tells the GPU to execute the instructions the CPU loaded into its memory.
The information consists of the software that the GPU needs to run. This software is bundled with the driver and then the driver handles the responsibility split between the CPU and GPU (by running portions of its code on both devices).
The driver then manages a series of "windows" into GPU memory that the CPU can read from and write to. Generally, the access pattern involves the CPU writing instructions or information into mapped GPU memory and then instructing the GPU, through a register, to execute those instructions or process that information. The information includes shader logic, textures, and so on.
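A toy model of that access pattern might look like the following. To be clear, everything in it is invented for illustration: the `ToyGpu` struct, the opcodes, and the register names do not correspond to any real hardware or driver. It only shows the shape of the handshake: the CPU writes commands into mapped memory, writes a register to say "go", and the device executes what it finds and reports back through another register.

```c
#include <assert.h>
#include <stdint.h>

enum { CMD_NOP = 0, CMD_FILL = 1 }; /* made-up opcodes */

typedef struct {
    uint32_t command_buffer[8]; /* "mapped GPU memory" the CPU writes into */
    uint32_t command_count;     /* number of words currently queued */
    uint32_t doorbell;          /* register: CPU writes 1 to start execution */
    uint32_t status;            /* register: device sets this to 1 when done */
    uint32_t vram[4];           /* device-side memory the commands act on */
} ToyGpu;

/* CPU side: queue one command as an (opcode, argument) pair. */
void cpu_write_command(ToyGpu *gpu, uint32_t opcode, uint32_t arg)
{
    assert(gpu->command_count + 2 <= 8); /* the buffer must have room */
    gpu->command_buffer[gpu->command_count++] = opcode;
    gpu->command_buffer[gpu->command_count++] = arg;
}

/* Device side: what the hardware would do once the doorbell has been rung. */
void device_step(ToyGpu *gpu)
{
    if (gpu->doorbell == 0) {
        return; /* nobody asked us to run yet */
    }
    for (uint32_t i = 0; i + 1 < gpu->command_count; i += 2) {
        if (gpu->command_buffer[i] == CMD_FILL) {
            for (int j = 0; j < 4; ++j) {
                gpu->vram[j] = gpu->command_buffer[i + 1];
            }
        }
    }
    gpu->command_count = 0;
    gpu->doorbell = 0;
    gpu->status = 1; /* tell the CPU we are finished */
}
```

Note that queuing a command does nothing by itself; the device only acts after the CPU sets `doorbell`, which mirrors the "write a value to a register that tells the GPU to execute" step described above.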