Why doesn't left bit-shift, "<<", for 32-bit integers work as expected when used more than 32 times?

When I write the following program and use the GNU C++ compiler, the output is 1 which I think is due to the rotation operation performed by the compiler.

#include <iostream>

int main()
{
    int a = 1;
    std::cout << (a << 32) << std::endl;

    return 0;
}

But logically, as it's said that the bits are lost if they overflow the bit width, the output should be 0. What is happening?

The code is on ideone, http://ideone.com/VPTwj.


This is caused due to a combination of an undefined behaviour in C and the fact that code generated for IA-32 processors has a 5 bit mask applied on the shift count. This means that on IA-32 processors, the range of a shift count is 0-31 only. 1

From The C programming language 2

The result is undefined if the right operand is negative, or greater than or equal to the number of bits in the left expression’s type.

From IA-32 Intel Architecture Software Developer’s Manual 3

The 8086 does not mask the shift count. However, all other IA-32 processors (starting with the Intel 286 processor) do mask the shift count to 5 bits, resulting in a maximum count of 31. This masking is done in all operating modes (including the virtual-8086 mode) to reduce the maximum execution time of the instructions.



1http://codeyarns.com/2004/12/20/c-shift-operator-mayhem/

2 A7.8 Shift Operators, Appendix A. Reference Manual, The C Programming Language

3 SAL/SAR/SHL/SHR – Shift, Chapter 4. Instruction Set Reference, IA-32 Intel Architecture Software Developer’s Manual


In C++, shift is only well-defined if you shift a value less steps than the size of the type. If int is 32 bits, then only 0 to, and including, 31 steps is well-defined.

So, why is this?

If you take a look at the underlying hardware that performs the shift, if it only has to look at the lower five bits of a value (in the 32 bit case), it can be implemented using less logical gates than if it has to inspect every bit of the value.

Answer to question in comment

C and C++ are designed to run as fast as possible, on any available hardware. Today, the generated code is simply a ''shift'' instruction, regardless how the underlying hardware handles values outside the specified range. If the languages would have specified how shift should behave, the generated could would have to check that the shift count is in range before performing the shift. Typically, this would yield three instructions (compare, branch, shift). (Admittedly, in this case it would not be necessary as the shift count is known.)


It's undefined behaviour according to the C++ standard:

The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1×2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.


The answers of Lindydancer and 6502 explain why (on some machines) it happens to be a 1 that is being printed (although the behavior of the operation is undefined). I am adding the details in case they aren't obvious.

I am assuming that (like me) you are running the program on an Intel processor. GCC generates these assembly instructions for the shift operation:

movl $32, %ecx
sall %cl, %eax

On the topic of sall and other shift operations, page 624 in the Instruction Set Reference Manual says:

The 8086 does not mask the shift count. However, all other Intel Architecture processors (starting with the Intel 286 processor) do mask the shift count to five bits, resulting in a maximum count of 31. This masking is done in all operating modes (including the virtual-8086 mode) to reduce the maximum execution time of the instructions.

Since the lower 5 bits of 32 are zero, then 1 << 32 is equivalent to 1 << 0, which is 1.

Experimenting with larger numbers, we would predict that

cout << (a << 32) << " " << (a << 33) << " " << (a << 34) << "\n";

would print 1 2 4, and indeed that is what is happening on my machine.