Why is -(-2147483648) = - 2147483648 in a 32-bit machine?

Solution 1:

Negating an (unsuffixed) integer constant:

The expression -(-2147483648) is perfectly defined in C, however it may be not obvious why it is this way.

When you write -2147483648, it is formed as unary minus operator applied to integer constant. If 2147483648 can't be expressed as int, then it s is represented as long or long long* (whichever fits first), where the latter type is guaranteed by the C Standard to cover that value.

To confirm that, you could examine it by:

printf("%zu\n", sizeof(-2147483648));

which yields 8 on my machine.

The next step is to apply second - operator, in which case the final value is 2147483648L (assuming that it was eventually represented as long). If you try to assign it to int object, as follows:

int n = -(-2147483648);

then the actual behavior is implementation-defined. Referring to the Standard:

C11 §6.3.1.3/3 Signed and unsigned integers

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

The most common way is to simply cut-off the higher bits. For instance, GCC documents it as:

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

Conceptually, the conversion to type of width 32 can be illustrated by bitwise AND operation:

value & (2^32 - 1) // preserve 32 least significant bits

In accordance with two's complement arithmetic, the value of n is formed with all zeros and MSB (sign) bit set, which represents value of -2^31, that is -2147483648.

Negating an int object:

If you try to negate int object, that holds value of -2147483648, then assuming two's complement machine, the program will exhibit undefined behavior:

n = -n; // UB if n == INT_MIN and INT_MAX == 2147483647

C11 §6.5/5 Expressions

If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

Additional references:

  • INT32-C. Ensure that operations on signed integers do not result in overflow

*) In withdrawed C90 Standard, there was no long long type and the rules were different. Specifically, sequence for unsuffixed decimal was int, long int, unsigned long int (C90 §6.1.3.2 Integer constants).

†) This is due to LLONG_MAX, which must be at least +9223372036854775807 (C11 §5.2.4.2.1/1).

Solution 2:

Note: this answer does not apply as such on the obsolete ISO C90 standard that is still used by many compilers

First of all, on C99, C11, the expression -(-2147483648) == -2147483648 is in fact false:

int is_it_true = (-(-2147483648) == -2147483648);
printf("%d\n", is_it_true);

prints

0

So how it is possible that this evaluates to true? The machine is using 32-bit two's complement integers. The 2147483648 is an integer constant that quite doesn't fit in 32 bits, thus it will be either long int or long long int depending on whichever is the first where it fits. This negated will result in -2147483648 - and again, even though the number -2147483648 can fit in a 32-bit integer, the expression -2147483648 consists of a >32-bit positive integer preceded with unary -!

You can try the following program:

#include <stdio.h>

int main() {
    printf("%zu\n", sizeof(2147483647));
    printf("%zu\n", sizeof(2147483648));
    printf("%zu\n", sizeof(-2147483648));
}

The output on such machine most probably would be 4, 8 and 8.

Now, -2147483648 negated will again result in +214783648, which is still of type long int or long long int, and everything is fine.

In C99, C11, the integer constant expression -(-2147483648) is well-defined on all conforming implementations.


Now, when this value is assigned to a variable of type int, with 32 bits and two's complement representation, the value is not representable in it - the values on 32-bit 2's complement would range from -2147483648 to 2147483647.

The C11 standard 6.3.1.3p3 says the following of integer conversions:

  • [When] the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

That is, the C standard doesn't actually define what the value in this case would be, or doesn't preclude the possibility that the execution of the program stops due to a signal being raised, but leaves it to the implementations (i.e. compilers) to decide how to handle it (C11 3.4.1):

implementation-defined behavior

unspecified behavior where each implementation documents how the choice is made

and (3.19.1):

implementation-defined value

unspecified value where each implementation documents how the choice is made


In your case, the implementation-defined behaviour is that the value is the 32 lowest-order bits [*]. Due to the 2's complement, the (long) long int value 0x80000000 has the bit 31 set and all other bits cleared. In 32-bit two's complement integers the bit 31 is the sign bit - meaning that the number is negative; all value bits zeroed means that the value is the minimum representable number, i.e. INT_MIN.


[*] GCC documents its implementation-defined behaviour in this case as follows:

The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

Solution 3:

This is not a C question, for on a C implementation featuring 32-bit two's complement representation for type int, the effect of applying the unary negation operator to an int having the value -2147483648 is undefined. That is, the C language specifically disavows designating the result of evaluating such an operation.

Consider more generally, however, how the unary - operator is defined in two's complement arithmetic: the inverse of a positive number x is formed by flipping all the bits of its binary representation and adding 1. This same definition serves as well for any negative number that has at least one bit other than its sign bit set.

Minor problems arise, however, for the two numbers that have no value bits set: 0, which has no bits set at all, and the number that has only its sign bit set (-2147483648 in 32-bit representation). When you flip all the bits of either of these, you end up with all value bits set. Therefore, when you subsequently add 1, the result overflows the value bits. If you imagine performing the addition as if the number were unsigned, treating the sign bit as a value bit, then you get

    -2147483648 (decimal representation)
-->  0x80000000 (convert to hex)
-->  0x7fffffff (flip bits)
-->  0x80000000 (add one)
--> -2147483648 (convert to decimal)

Similar applies to inverting zero, but in that case the overflow upon adding 1 overflows the erstwhile sign bit, too. If the overflow is ignored, the resulting 32 low-order bits are all zero, hence -0 == 0.