Why did GCC generate mov %eax,%eax and what does it mean?
GCC 4.4.3 generated the following x86_64 assembly. The part that confuses me is the mov %eax,%eax. Move the register to itself? Why?
23b6c: 31 c9 xor %ecx,%ecx ; the 0 value for shift
23b6e: 80 7f 60 00 cmpb $0x0,0x60(%rdi) ; is it shifted?
23b72: 74 03 je 23b77
23b74: 8b 4f 64 mov 0x64(%rdi),%ecx ; is shifted so load shift value to ecx
23b77: 48 8b 57 38 mov 0x38(%rdi),%rdx ; map base
23b7b: 48 03 57 58 add 0x58(%rdi),%rdx ; plus offset to value
23b7f: 8b 02 mov (%rdx),%eax ; load map_used value to eax
23b81: 89 c0 mov %eax,%eax ; then what the heck is this? promotion from uint32 to 64-bit size_t?
23b83: 48 d3 e0 shl %cl,%rax ; shift rax/eax by cl/ecx
23b86: c3 retq
The C++ code for this function is:
uint32_t shift = used_is_shifted ? shift_ : 0;
le_uint32_t le_map_used = *used_p();
size_t map_used = le_map_used;
return map_used << shift;
An le_uint32_t is a class which wraps byte-swap operations on big-endian machines. On x86 it does nothing. The used_p() function computes a pointer from the map base + offset and returns a pointer of the correct type.
Solution 1:
In x86-64, any instruction that writes a 32-bit register implicitly zero-extends the result into the full 64-bit register: bits 32-63 are cleared (to avoid false dependencies). That is sometimes why you'll see odd-looking instructions like this one. (See also: Is mov %esi, %esi a no-op or not on x86-64?)
However, in this case the previous mov-load is also 32-bit, so the high half of %rax is already cleared. The mov %eax, %eax appears to be redundant, apparently just a GCC missed optimization.
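To make the zero-extension point concrete in C++ terms, here is a minimal sketch of the widening that the function in the question performs. The names widen_then_shift, shift_then_widen and check_examples are made up for this illustration and are not from the original code. It assumes the shift count is below 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the question's function: widen to 64 bits,
// then shift. The conversion uint32_t -> uint64_t is a zero-extension,
// so bits 32-63 of the widened value are 0 before the shift.
// Assumes shift < 64.
constexpr std::uint64_t widen_then_shift(std::uint32_t value, std::uint32_t shift) {
    return static_cast<std::uint64_t>(value) << shift;
}

// For contrast: shift in 32 bits first, then widen. Bits shifted past
// bit 31 are discarded before the zero-extension happens.
// Assumes shift < 32.
constexpr std::uint64_t shift_then_widen(std::uint32_t value, std::uint32_t shift) {
    return static_cast<std::uint32_t>(value << shift);
}

// Small runtime check of the difference between the two orderings.
inline void check_examples() {
    // Widening first keeps the bit that moves into position 32.
    assert(widen_then_shift(0x80000000u, 1) == 0x100000000ull);
    // Shifting first loses it.
    assert(shift_then_widen(0x80000000u, 1) == 0u);
}
```

On x86-64 the zero-extension in widen_then_shift needs no instruction of its own when the 32-bit value has just been written by a 32-bit operation such as the mov (%rdx),%eax load, which is why the extra mov %eax,%eax in the listing adds nothing.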