Detecting signed overflow in C/C++
At first glance, this question may seem like a duplicate of How to detect integer overflow?, however it is actually significantly different.
I've found that while detecting an unsigned integer overflow is pretty trivial, detecting a signed overflow in C/C++ is actually more difficult than most people think.
The most obvious, yet naive, way to do it would be something like:
int add(int lhs, int rhs)
{
int sum = lhs + rhs;
if ((lhs >= 0 && sum < rhs) || (lhs < 0 && sum > rhs)) {
/* an overflow has occurred */
abort();
}
return sum;
}
The problem with this is that according to the C standard, signed integer overflow is undefined behavior. In other words, according to the standard, as soon as you even cause a signed overflow, your program is just as invalid as if you dereferenced a null pointer. So you can't cause undefined behavior, and then try to detect the overflow after the fact, as in the above post-condition check example.
Even though the above check is likely to work on many compilers, you can't count on it. In fact, because the C standard says signed integer overflow is undefined, some compilers (like GCC) will optimize away the above check when optimization flags are set, because the compiler assumes a signed overflow is impossible. This totally breaks the attempt to check for overflow.
So, another possible way to check for overflow would be:
int add(int lhs, int rhs)
{
if (lhs >= 0 && rhs >= 0) {
if (INT_MAX - lhs <= rhs) {
/* overflow has occurred */
abort();
}
}
else if (lhs < 0 && rhs < 0) {
if (lhs <= INT_MIN - rhs) {
/* overflow has occurred */
abort();
}
}
return lhs + rhs;
}
This seems more promising, since we don't actually add the two integers together until we make sure in advance that performing such an add will not result in overflow. Thus, we don't cause any undefined behavior.
However, this solution is unfortunately a lot less efficient than the initial solution, since you have to perform a subtract operation just to test if your addition operation will work. And even if you don't care about this (small) performance hit, I'm still not entirely convinced this solution is adequate. The expression lhs <= INT_MIN - rhs
seems exactly like the sort of expression the compiler might optimize away, thinking that signed overflow is impossible.
So is there a better solution here? Something that is guaranteed to 1) not cause undefined behavior, and 2) not provide the compiler with an opportunity to optimize away overflow checks? I was thinking there might be some way to do it by casting both operands to unsigned, and performing checks by rolling your own two's-complement arithmetic, but I'm not really sure how to do that.
Solution 1:
No, your 2nd code isn't correct, but you are close: if you set
int half = INT_MAX/2;
int half1 = half + 1;
the result of an addition is INT_MAX
. (INT_MAX
is always an odd number). So this is valid input. But in your routine you will have INT_MAX - half == half1
and you would abort. A false positive.
This error can be repaired by putting <
instead of <=
in both checks.
But then also your code isn't optimal. The following would do:
int add(int lhs, int rhs)
{
if (lhs >= 0) {
if (INT_MAX - lhs < rhs) {
/* would overflow */
abort();
}
}
else {
if (rhs < INT_MIN - lhs) {
/* would overflow */
abort();
}
}
return lhs + rhs;
}
To see that this is valid, you have to symbolically add lhs
on both sides of the inequalities, and this gives you exactly the arithmetical conditions that your result is out of bounds.
Solution 2:
Your approach with subtraction is correct and well-defined. A compiler cannot optimize it away.
Another correct approach, if you have a larger integer type available, is to perform the arithmetic in the larger type and then check that the result fits in the smaller type when converting it back
int sum(int a, int b)
{
long long c;
assert(LLONG_MAX>INT_MAX);
c = (long long)a + b;
if (c < INT_MIN || c > INT_MAX) abort();
return c;
}
A good compiler should convert the entire addition and if
statement into an int
-sized addition and a single conditional jump-on-overflow and never actually perform the larger addition.
Edit: As Stephen pointed out, I'm having trouble getting a (not-so-good) compiler, gcc, to generate the sane asm. The code it generates is not terribly slow, but certainly suboptimal. If anyone knows variants on this code that will get gcc to do the right thing, I'd love to see them.