Is this a compiler optimisation bug, or an undefined behaviour?

Solution 1:

This is a code optimizer bug. It inlines both getData() and SetBit(). The combination appears to be fatal, it loses track of the value of 1 << ((pos) & 7) and always produces zero.

This bug does not occur on VS2012. A workaround is to force one of the functions to not get inlined. Given the code, you probably want to do that for getData():

__declspec(noinline)
size_t getData(size_t p_value)
{ 
    // etc..
}

Solution 2:

Addendum 2 The smallest possible part of the OP's code is given below. This snippet leads to the said optimizer bug in VS2010 - dependend on the contents of inline-expanded GetData(). Even if one combines the two returns in GetData() into one the bug is "gone". Also, it does not lead to a bug if you combine bits in only the first byte (like char bitmap[1]; - you need two bytes).

The problem does not occur under VS2012. This feels horrible because MS fixed that obviously in 2012 but not in 2010. WTF?

BTW:

  • g++ 4.6.2 x64 (-O3) -- ok
  • icpc 12.1.0 x64 (-O3) -- ok

VS2010 optimizer bug verification:

#include <iostream>
const size_t B_5=5, B_9=9;

size_t GetBit(unsigned char * b, size_t p) { return b[p>>3]  & (1 << (p & 7)); }
void   SetBit(unsigned char * b, size_t p) {        b[p>>3] |= (1 << (p & 7)); }

size_t GetData(size_t p) {
   if (p == B_5) return B_9;
   return 0;
}
/* SetBit-invocation will fail (write 0) 
   if inline-expanded in the vicinity of the GetData function, VS2010 */

 int main(int argc, char * argv[])
{
 unsigned char bitmap[2] = { 0, 0 };
 SetBit(bitmap, B_5);

 for(size_t i=0; i<2*8; ++i) {
    if( GetBit(bitmap, i) )         // no difference if temporary variable used,
        SetBit(bitmap, GetData(i)); // the optimizer will drop it anyway
 }

 const size_t byte=bitmap[1], bit=1<<1, value=byte & bit;
 std::cout << (value == 0 ? "ERROR: The bit 9 should NOT be 0" 
                          : "Ok: The bit 9 is 1") << std::endl;
 return 0;
}

After some inspection one can see that the initialization/zeroing part is not part of this specific problem.

Looked again after the meal. Seems to be a char/int propagation error. Can be cured by changing the mask functions (as has been already found out by the OP) to:

size_t TestBit  (const unsigned char * bits, size_t pos) { 
 return (bits)[pos >> 3] &   (1 << ( char(pos) & 7) ) ; 
}
size_t SetBit   (unsigned char * bits, size_t pos)       { 
 return (bits)[pos >> 3] |=  (1 << ( char(pos) & 7) ) ; 
}
size_t ResetBit (unsigned char * bits, size_t pos)       { 
 return (bits)[pos >> 3] &= ~(1 << ( char(pos) & 7) ) ; 
}

by casting the int-sized position pos to a char-size. This will lead the optimizer in VS2010 to do the right thing. Maybe somebody can comment.