C++ memory model and race conditions on char arrays
Basically, I have trouble understanding this passage (from Bjarne's FAQ):
However, most modern processors cannot read or write a single character, it must read or write a whole word, so the assignment to c really is ``read the word containing c, replace the c part, and write the word back again.'' Since the assignment to b is similar, there are plenty of opportunities for the two threads to clobber each other even though the threads do not (according to their source text) share data!
So how can char arrays exist without 3 (or 7?) bytes of padding between elements?
Solution 1:
I think Bjarne is wrong about this, or at least, he's simplifying things considerably. Most modern processors are capable of writing a byte without reading a complete word first, or rather, they behave "as if" this were the case. In particular, if you have char array[2]; and thread one only accesses array[0] and thread two only accesses array[1] (including when both threads are mutating the value), then you do not need any additional synchronization; this is guaranteed by the standard. If the hardware does not allow this directly, the compiler will have to add the synchronization itself.
It's very important to note the "as if", above. Modern hardware does access main memory by cache lines, not bytes. But it also has provisions for modifying single bytes in a cache line, so that when writing back, the processor core will not modify bytes that have not been modified in its cache.
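To make the array[0] / array[1] case concrete, here is a minimal sketch, assuming a C++11 compiler with std::thread support. The function name adjacent_writes_are_independent and its iterations parameter are made up for illustration. Each thread writes only its own element, and there is no mutex or atomic anywhere:

```cpp
#include <thread>

// Hypothetical helper (not from the FAQ or the standard): two threads each
// write only their own element of a two-element char array, with no locks.
// Returns true if, after both threads finish, each element holds the last
// value written by its owning thread. Assumes iterations >= 1.
bool adjacent_writes_are_independent(int iterations) {
    char array[2] = {0, 0};

    // Thread 0 touches array[0] only.
    std::thread t0([&array, iterations] {
        for (int i = 0; i < iterations; ++i)
            array[0] = static_cast<char>('a' + i % 26);
    });
    // Thread 1 touches array[1] only.
    std::thread t1([&array, iterations] {
        for (int i = 0; i < iterations; ++i)
            array[1] = static_cast<char>('A' + i % 26);
    });

    // join() synchronizes with the completion of each thread, so reading
    // both elements afterwards is well defined.
    t0.join();
    t1.join();

    const char expected0 = static_cast<char>('a' + (iterations - 1) % 26);
    const char expected1 = static_cast<char>('A' + (iterations - 1) % 26);
    return array[0] == expected0 && array[1] == expected1;
}
```

Calling adjacent_writes_are_independent(1000) from main should return true on a conforming implementation, since the two elements are distinct memory locations. A passing run does not prove anything on its own, of course; the guarantee comes from the standard, not from the test.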
Solution 2:
A platform that supports C++11 must be able to access storage of the size of one char without inventing writes. x86 does indeed have that ability. If a processor can only ever modify 32 bits at once, it must have a 32-bit wide char.
(Some background reasoning: arrays are stored contiguously, and chars have no padding (3.9.1).)
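The background reasoning can be checked at compile time. This is only a sketch: the helper name element_stride is invented for illustration, and the static_asserts restate properties that hold on any conforming implementation rather than anything platform-specific:

```cpp
#include <climits>
#include <cstddef>

// sizeof(char) is 1 by definition, whatever the width of a char in bits.
static_assert(sizeof(char) == 1, "sizeof(char) is always 1");
// A char has at least 8 bits; an implementation is free to make it wider.
static_assert(CHAR_BIT >= 8, "char has at least 8 bits");
// Array elements are contiguous, so four chars occupy exactly four bytes.
static_assert(sizeof(char[4]) == 4, "no padding between char elements");

// Hypothetical helper: distance, in chars, between two adjacent elements.
std::ptrdiff_t element_stride() {
    char array[2] = {};
    return &array[1] - &array[0];
}
```

element_stride() returns 1, so there is no room for 3 or 7 padding bytes between elements. A machine that could only store 32 bits at a time would show up here as a larger CHAR_BIT, not as padding.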