What is the loop inversion technique?

while (condition) { 
  ... 
}

Workflow:

  1. check condition;
  2. if false, jump to outside of loop;
  3. run one iteration;
  4. jump to top.

if (condition) do {
  ...
} while (condition);

Workflow:

  1. check condition;
  2. if false, jump to beyond the loop;
  3. run one iteration;
  4. check condition;
  5. if true, jump to step 3.

Comparing the two, you can see that the latter may perform no jumps at all when the loop body runs exactly once; in general, the number of jumps is one less than the number of iterations. The former has to jump back to check the condition, only to jump out of the loop when the condition is false.
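To make the counting concrete, here is a minimal sketch that models the two workflows above and counts the jumps that are actually taken. The class and method names (JumpCount, whileFormJumps, invertedFormJumps) are made up for illustration. It is a model of the control flow, not a measurement of what any real JIT emits.

```java
// Hypothetical model: counts taken jumps for each loop layout.
final class JumpCount {
    // Layout of `while (cond) { body }`: test at the top, unconditional jump at the bottom.
    static int whileFormJumps(int iterations) {
        int jumps = 0;
        int remaining = iterations;
        while (true) {
            if (remaining == 0) { // steps 1-2: condition false, jump out of the loop
                jumps++;
                break;
            }
            remaining--;          // step 3: run one iteration
            jumps++;              // step 4: unconditional jump back to the top
        }
        return jumps;
    }

    // Layout of `if (cond) do { body } while (cond);`: guard first, test at the bottom.
    static int invertedFormJumps(int iterations) {
        int remaining = iterations;
        if (remaining == 0) {
            return 1;             // steps 1-2: guard false, jump beyond the loop
        }
        int jumps = 0;
        do {
            remaining--;          // step 3: run one iteration
            if (remaining > 0) {
                jumps++;          // steps 4-5: condition true, jump back to step 3
            }
        } while (remaining > 0);
        return jumps;             // condition false: fall through, no jump taken
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 3; n++) {
            System.out.println(n + " iterations: while=" + whileFormJumps(n)
                    + ", inverted=" + invertedFormJumps(n));
        }
    }
}
```

In this model the while form takes one more jump than it has iterations, while the inverted form takes one less (and a single jump when the loop is skipped entirely).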

Jumps on modern pipelined CPU architectures can be quite expensive: as the CPU is finishing the execution of the checks before the jump, the instructions beyond that jump are already in the middle of the pipeline. All this processing must be discarded if the branch prediction fails. Further execution is delayed while the pipeline is being reprimed.

To explain the branch prediction mentioned above: on some architectures, each kind of conditional jump comes in two variants, each including a bet on the outcome. For example, you would put an instruction saying "jump if not zero, betting on not zero" at the end of a loop because the jump will have to be made on all iterations except the last one. That way the CPU starts filling its pipeline with the instructions following the jump target instead of those following the jump instruction itself.

Important note

Please do not take this as an example of how to optimize at the source code level. That would be entirely misguided since, as is already clear from your question, the transformation from the first form into the second is something the JIT compiler does routinely, completely on its own.


This can optimize a loop that is always executed at least once.

A regular while loop that runs at least once will always jump back to the start at least once, and then jump past the end once when it finishes. An example of a simple loop running once:

int i = 0;
while (i++ < 1) {
    //do something
}  

A do-while loop on the other hand will skip the first and last jump. Here is an equivalent loop to the one above, that will run without jumps:

int i = 0;
if (i++ < 1) {
    do {
        //do something
    } while (i++ < 1); 
}

Let's walk through them:

The while version:

void foo(int n) {
    while (n < 10) {
       use(n);
       ++n;
    }
    done();
}
  1. First we test n and jump to done() if the condition is not true.
  2. Then we use and increment n.
  3. Now we jump back to the condition.
  4. Rinse, repeat.
  5. When the condition is no longer true, we jump to done().

The do-while version:

(Remember, we don't actually do this in the source code, because that would introduce maintenance issues; the compiler/JIT does it for us.)

void foo(int n) {
    if (n < 10) {
        do {
            use(n);
            ++n;
        }
        while (n < 10);
    }
    done();
}
  1. First we test n and jump to done() if the condition is not true.
  2. Then we use and increment n.
  3. Now we test the condition and jump back if it's true.
  4. Rinse, repeat.
  5. When the condition is no longer true, we flow (not jump) to done().

So, for instance, if n starts out at 9, we never jump at all in the do-while version, whereas in the while version we have to jump back to the beginning, do the test, and then jump to the end when we see it isn't true.
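If you want to convince yourself that the two versions really do the same work, here is a small sketch that records every value passed to use(n) in both forms and compares the results. The class name InversionCheck and the list standing in for use(n) are assumptions for illustration only; this is still not something you would write by hand in production code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical check that the while form and the inverted form behave identically.
final class InversionCheck {
    // Original form: the condition is tested at the top of every pass.
    static List<Integer> whileForm(int n) {
        List<Integer> used = new ArrayList<>();
        while (n < 10) {
            used.add(n); // stands in for use(n)
            ++n;
        }
        return used;
    }

    // Inverted form: one guard test up front, then a bottom-tested loop.
    static List<Integer> invertedForm(int n) {
        List<Integer> used = new ArrayList<>();
        if (n < 10) {
            do {
                used.add(n); // stands in for use(n)
                ++n;
            } while (n < 10);
        }
        return used;
    }

    public static void main(String[] args) {
        // 10 and 42 cover the zero-iteration case that the guard exists for.
        for (int n : new int[] {0, 9, 10, 42}) {
            System.out.println("n=" + n + " same result: "
                    + whileForm(n).equals(invertedForm(n)));
        }
    }
}
```

The guard matters: without the outer if, the do-while would run the body once even when n starts at 10 or above.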


Loop inversion is an optimization technique that improves performance because the processor can accomplish the same result while executing fewer jump instructions. The benefit shows up mostly at the loop boundaries, that is, when the loop is entered and when it is exited.

This link provides another example of loop inversion. On the few architectures where decrement-and-compare is implemented as a single instruction, it makes sense to convert a for loop into a while loop with a decrement-and-compare operation.
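As a rough sketch of that conversion, written in Java: the first method counts up and compares against the bound, and the second counts down so that the loop test becomes "decrement, then compare with zero". The class and method names are made up, and whether a given compiler actually maps the second form onto a single decrement-and-branch instruction depends on the architecture, so treat that part as an assumption. The rewrite is only valid when the body does not depend on the direction of iteration.

```java
// Hypothetical illustration of turning a count-up loop into a count-down loop.
final class CountDown {
    // Counting up: each pass compares i against the bound.
    static int countUp(int iterations) {
        int passes = 0;
        for (int i = 0; i < iterations; i++) {
            passes++; // loop body
        }
        return passes;
    }

    // Counting down: guarded, bottom-tested loop whose test is decrement-and-compare-with-zero.
    static int countDown(int iterations) {
        int passes = 0;
        int i = iterations;
        if (i > 0) {
            do {
                passes++; // loop body
            } while (--i != 0);
        }
        return passes;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 5}) {
            System.out.println(n + ": up=" + countUp(n) + ", down=" + countDown(n));
        }
    }
}
```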

Wikipedia has a very good example, which I repeat here.

  int i, a[100];
  i = 0;
  while (i < 100) {
    a[i] = 0;
    i++;
  }

will be converted by the compiler to

  int i, a[100];
  i = 0;
  if (i < 100) {
    do {
      a[i] = 0;
      i++;
    } while (i < 100);
  }

How does this translate to performance? On the final iteration, when the value of i is 99, the processor does not need to perform the extra jump (GOTO) back to the top that the first version requires. This improves performance.