Which is the fastest way to get the absolute value of a number

There is a great trick to calculate the absolute value of a 2s-complement integer without using an if statement. The theory goes, if the value is negative you want to toggle the bits and add one, otherwise you want to pass the bits through as is. A XOR 1 happens to toggle A and A XOR 0 happens to leave A intact. So you want do something like this:

  uint32_t temp = value >> 31;     // make a mask of the sign bit
  value ^= temp;                   // toggle the bits if value is negative
  value += temp & 1;               // add one if value was negative

In principle, you can do it in as few as three assembly instructions (without a branch). And you'd like to think that the abs() function that you get with math.h does it optimally.

No branches == better performance. Contrary to @paxdiablo's response above, this really matters in deep pipelines where the more branches you have in your code, the more likely you are to have your branch predictor get it wrong and have to roll-back, etc. If you avoid branching where possible, things will keep moving along at full throttle in your core :).

Conditionals are slower than plain arithmetic operations, but much, much faster than something as silly as calculating the square root.

Rules of thumb from my assembly days:

Integer or bitwise op: 1 cycle
Floating-point add/sub/mul: 4 cycles
Floating-point div: ~30 cycles
Floating-point exponentiation: ~200 cycles
Floating-point sqrt: ~60 cycles depending on implementation
Conditional branch: avg. 10 cycles, better if well-predicted, much worse if mispredicted

Calculating the square root is probably one of the worst things you could do because it is really slow. Usually there is a library function for doing this; something like Math.Abs(). Multiplying with -1 is also unnecessary; just return -x. So a good solution would be the following.

(x >= 0) ? x : -x

The compiler will probably optimize this to a single instructions. Conditions may be quite expensive on modern processors because of the long execution pipelines -the calculations must be thrown away if a branch was misspredicted and the processor started executing the instructions from the wrong code path. But because of the mentioned compiler optimization you need not care in this case.

For completeness, here's a way to do it for IEEE floats on x86 systems in C++:

*(reinterpret_cast<uint32_t*>(&foo)) &= 0xffffffff >> 1;

Which is the fastest way to get the absolute value of a number

I think the "right" answer isn't here actually. The fastest way to get the absolute number is probably to use the Intel Intrinsic. See https://software.intel.com/sites/landingpage/IntrinsicsGuide/ and look for 'vpabs' (or another intrinsic that does the job for your CPU). I'm pretty sure it'll beat all the other solutions here.

If you don't like intrinsics (or cannot use them or ...), you might want to check if the Compiler is smart enough to figure out if a call to 'native absolute value' (std::abs in C++ or Math.Abs(x) in C#) will change automagically into the intrinsic - basically that involves looking at the disassembled (compiled) code. If you're in a JIT, be sure that JIT optimizations aren't disabled.

If that also doesn't give you the optimized instructions, you can use the method described here: https://graphics.stanford.edu/~seander/bithacks.html#IntegerAbs .

Which is the fastest way to get the absolute value of a number

Related

Recent Posts