Determining if a game is CPU- or GPU-limited

What is a reliable, but not too time-consuming way to determine if a certain game is limited by the graphics card or by the processor speed?

I have some preconceptions about whether I'm mostly GPU- or CPU-limited, but I'd like to verify whether I'm right about that. I have some ideas about how I would go about testing it, but performing accurate benchmarks is notoriously tricky and I don't have much experience with it.

So I'm wondering what would be an easy way to determine the bottleneck, having only one computer with a certain configuration available? What tools would I use for that purpose?

It would also be interesting to determine if the amount of VRAM on my graphics card is a limiting factor.


Solution 1:

Two rules of thumb I used to use:

If increasing the resolution brought about a large drop in the frame rate, it could indicate that the game was GPU-bound, as the increased resolution made the video card work much harder and, as a result, take longer to get each frame out the door.

On the other hand, if increasing the resolution brought about only a negligible decrease in frame rate (or no decrease at all), then it was an indicator that the game was CPU-bound, as the additional video complexity was easily handled by spare video card processing capacity whereas the main game processing logic already pegged the CPU. In this situation, the game engine is too busy to be able to give the video subsystem enough work.

However, I don't think games nowadays are so simple that performance is limited that cleanly. For many years now, GPUs have grown more complex and developers keep finding ways to offload more work to them, so there's a lot of work that could be done on either side. These rules of thumb also assumed that raising the resolution created no additional work for the CPU. That's why I don't use them very much anymore - they just don't really apply.

If you followed the video card benchmarks on Tom's Hardware Guide religiously in the early 2000s, you could sometimes see this: run e.g. Quake 3 on a then-"modern" video card and you'd get bar charts that just flatlined across resolutions.

If you can measure the frame rate of your game, give this a try and see what happens.
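If you can log the frame rate at two resolutions, the rule of thumb above reduces to a simple comparison. A minimal sketch in Python (the fps numbers and the 20% cutoff are illustrative assumptions, not fixed values):

```python
def resolution_test(fps_low_res, fps_high_res, threshold=0.20):
    """Apply the rule of thumb: a large frame-rate drop when the
    resolution goes up suggests a GPU limit; little or no drop
    suggests the CPU is the limit.

    threshold is the fractional drop considered "large" -- an
    arbitrary cutoff, tune it to your own measurements.
    """
    drop = (fps_low_res - fps_high_res) / fps_low_res
    return "likely GPU-bound" if drop > threshold else "likely CPU-bound"

# Example: 120 fps at 1280x720 vs 70 fps at 2560x1440 -> big drop
print(resolution_test(120, 70))   # likely GPU-bound
# 120 fps vs 118 fps -> frame rate barely moved
print(resolution_test(120, 118))  # likely CPU-bound
```

Remember the caveat above: this assumes the resolution change adds no CPU work, which modern engines don't always guarantee.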

Solution 2:

As reducing the screen resolution and/or texture detail settings is pretty much guaranteed to improve the performance of any game, this alone can't be used to determine whether it's GPU-bound or not. You could, however, look at what the highest resolution/texture detail level available for your system is.

If you could reduce the performance of your CPU (underclocking?), then this might give you some indication of whether the game is CPU-bound - but again, I don't think it can be 100% reliable.

There's also a blurring of what's done by the game engine (CPU) and what's done by the rendering engine (GPU). In the late 1990s/early 2000s, game physics used to be done by the CPU, but then graphics cards became capable of performing these calculations on dedicated hardware, speeding them up and improving performance. This means that counting how many objects are moving on the screen (for example) can't be used as a guide to how powerful a CPU you have, as much of the motion might be handled by a GPU-based physics engine.

One thing to bear in mind is that, due to the wide range of hardware that PC games have to run on, game developers will be on the lookout for any tricks that can improve performance, and the game will (hopefully) also degrade gracefully so that it's playable on lower-end machines. This means that if you have the hardware available, it will get used - if not to the utmost, then pretty close.

Solution 3:

From a purely observational perspective (not checking other applications, etc.), I believe that if you are CPU-bound you'll find that there is a lot more frame stutter as the GPU waits on the CPU, whereas if you are GPU-bound there will be a more consistent experience, albeit slow and laggy.

This could all be hooey, though.
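One way to put a number on "stutter" is to look at the spread of frame times rather than the average frame rate. A rough sketch, assuming you can capture per-frame timings from the game or an overlay tool (the sample data and the interpretation cutoffs are made up for illustration):

```python
from statistics import mean

def stutter_ratio(frame_times_ms):
    """Ratio of the worst ~1% frame time to the average frame time.
    Values near 1.0 mean consistent frame pacing; noticeably larger
    values mean occasional long frames, i.e. stutter."""
    ordered = sorted(frame_times_ms)
    p99 = ordered[min(len(ordered) - 1, int(len(ordered) * 0.99))]
    return p99 / mean(frame_times_ms)

smooth   = [16.7] * 99 + [17.0]      # steady ~60 fps
stuttery = [16.7] * 95 + [50.0] * 5  # similar average, occasional spikes
print(round(stutter_ratio(smooth), 2))    # close to 1.0
print(round(stutter_ratio(stuttery), 2))  # well above 1.0
```

Both runs here have a similar average frame rate, which is exactly why a plain fps counter can miss the stutter this answer describes.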

Solution 4:

If you're looking for a "quick and dirty" ballpark guess, here are two tests you can run:

  1. Use a system monitor or profiling tool to compare CPU to GPU usage and see if one is significantly underutilized. My suggestion is GPUView.
  2. Perform all GPU bottleneck tests at the same time and see if performance improves:
    • Reduce color/depth buffer bit depths.
    • Use highest mip level (lowest resolution) textures.
    • Reduce resolution.
    • Simplify vertex shader.
    • Reduce vertex format size.
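Test 1 can be approximated with a few lines of scripting: sample CPU and GPU utilization while the game runs and see whether one sits well below the other. A sketch of the classification step (the sampled percentages and the 60% cutoff are hypothetical; on a real system you would fill the lists from a monitoring tool such as GPUView):

```python
from statistics import mean

def guess_bottleneck(cpu_samples, gpu_samples, idle_cutoff=60.0):
    """Compare average CPU and GPU utilization (in percent).
    Whichever side is busy while the other sits below the
    cutoff is the likely bottleneck; otherwise it's unclear."""
    cpu, gpu = mean(cpu_samples), mean(gpu_samples)
    if gpu >= idle_cutoff > cpu:
        return "likely GPU-bound"
    if cpu >= idle_cutoff > gpu:
        return "likely CPU-bound"
    return "inconclusive"

print(guess_bottleneck([35, 40, 38], [97, 99, 98]))  # likely GPU-bound
```

One caveat: on a multi-core CPU, the overall utilization figure can hide a single pegged core, so per-core readings are more telling than the aggregate.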

What they mean:

  • Smaller bit depth means fewer resources used per pixel, revealing bottlenecks at the framebuffer level. Possible fixes:
    • Render depth first
    • Use less alpha blending
    • Disable depth writes where possible
    • Avoid needless buffer clears
    • Optimize skybox (render last with early z-out, or first with no depth r/w)
    • Use buffers with smaller bit depths
  • Smaller textures means fewer resources and fewer fetches, revealing bottlenecks in texture bandwidth. Possible fixes:
    • Use smaller textures
    • Use smaller texture bit depths
    • Compress textures
    • Use mipmapping
  • Reduced resolution means fewer pixels to process, revealing bottlenecks at both the framebuffer and fragment shader. Possible fixes:
    • Render depth first
    • Move work from fragment shader to vertex shader
    • Avoid excessive normalizing
    • Avoid excessive expensive texture filtering
    • Reduce fragment shader complexity :(
  • Smaller vertex shader means less processing per vertex, revealing bottlenecks at the vertex processing stage. Possible fixes:
    • Reduce number of vertices processed!
    • Move per-object computations to CPU-side
    • Use LOD
    • Do fewer transformations by starting with correct coordinate spaces
  • Smaller vertex formats means less data to transfer, revealing bottlenecks at the vertex/index transfer level. Possible fixes:
    • Use smaller vertex formats (if yours are needlessly fat)
    • Use smaller vertex formats (by deriving attributes from small types instead of storing bigger ones)
    • Use smaller index types
    • Access vertex/index data sequentially
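To get a feel for the numbers behind the texture bullet (and for the VRAM question), here is quick arithmetic for what one texture costs at a given size and bit depth. The sizes chosen are arbitrary examples; note that a full mip chain adds roughly a third on top of the base level (1 + 1/4 + 1/16 + ... = 4/3):

```python
def texture_bytes(width, height, bytes_per_texel, mipmapped=False):
    """VRAM cost of one uncompressed texture, optionally with a
    full mip chain (which costs about 1/3 extra over the base)."""
    base = width * height * bytes_per_texel
    return int(base * 4 / 3) if mipmapped else base

# 2048x2048 RGBA8 texture: 16 MiB base, ~21 MiB with mips
print(texture_bytes(2048, 2048, 4) // 2**20)                  # 16
print(texture_bytes(2048, 2048, 4, mipmapped=True) // 2**20)  # 21
```

This is why the texture fixes above pay off twice: halving texture dimensions quarters both the memory footprint and the bandwidth per fetch, and compression shrinks `bytes_per_texel` further still.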

If all of this is done and you don't see performance improvement, it's probably safe to say you're CPU-bound. Possible fixes:

  • More batching
  • Less locking
  • Do work further down the pipe
  • Get more realistic about your project
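"More batching" means issuing fewer, larger draw calls by grouping objects that share the same render state, since per-call overhead lands on the CPU. A toy sketch of the idea (the object and material names are invented for illustration):

```python
from collections import defaultdict

def batch_by_material(objects):
    """Group objects by material so each material costs one
    draw call instead of one call per object."""
    batches = defaultdict(list)
    for name, material in objects:
        batches[material].append(name)
    return dict(batches)

scene = [("rock1", "stone"), ("rock2", "stone"), ("tree1", "bark")]
batches = batch_by_material(scene)
print(len(scene), "draw calls ->", len(batches))  # 3 draw calls -> 2
```

Real engines do this with state sorting and instancing, but the principle is the same: the CPU-side cost scales with the number of calls, not the number of triangles.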

If this amount of GPU testing is still too extensive for you, you could pick the simplest tweaks out of it (such as reducing resolution) and hope it tells you something, but your results will obviously be less conclusive, since this will not reveal bottlenecks at certain stages of the GPU pipeline.

If you are GPU-bound, to further isolate bottlenecks on the GPU side, you'll want to perform these tests one at a time, starting from the bottom (end of the pipeline) and working your way up (framebuffer -> texture -> fragment shading -> vertex processing -> vertex transfer).
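That bottom-up ordering can be written out as a simple checklist: try each stage's tweak in turn, starting from the end of the pipe, and stop at the first one that improves performance. A sketch (the `improves` results are stand-ins for actually re-running the game with each tweak applied):

```python
# Pipeline stages ordered from the end of the pipe backwards,
# paired with the tweak from the list above that isolates each one.
STAGES = [
    ("framebuffer",       "reduce color/depth buffer bit depths"),
    ("texture",           "use highest mip level (lowest res) textures"),
    ("fragment shading",  "reduce resolution"),
    ("vertex processing", "simplify vertex shader"),
    ("vertex transfer",   "reduce vertex format size"),
]

def isolate_bottleneck(improves):
    """improves maps a stage name to True if its tweak helped.
    Return the first stage (from the end of the pipe) whose tweak
    improved performance, or None if none did (suggesting CPU-bound)."""
    for stage, _tweak in STAGES:
        if improves.get(stage):
            return stage
    return None

# Hypothetical run: only reducing the resolution helped.
print(isolate_bottleneck({"fragment shading": True}))  # fragment shading
```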

NOTE: This is of course assuming that you either have access to settings that allow you to make these modifications, or that you're a developer and it's a game for which you have access to source code.