Are mutex lock functions sufficient without volatile?
A coworker and I write software for a variety of platforms running on x86, x64, Itanium, PowerPC, and other 10-year-old server CPUs.
We just had a discussion about whether mutex functions such as pthread_mutex_lock() ... pthread_mutex_unlock() are sufficient by themselves, or whether the protected variable needs to be volatile.
int foo::bar()
{
    //...
    //code which may or may not access _protected.
    pthread_mutex_lock(m);
    int ret = _protected;
    pthread_mutex_unlock(m);
    return ret;
}
My concern is caching. Could the compiler place a copy of _protected on the stack or in a register, and use that stale value in the assignment? If not, what prevents that from happening? Are variations of this pattern vulnerable?
I presume that the compiler doesn't actually understand that pthread_mutex_lock() is a special function, so are we just protected by sequence points?
Thanks greatly.
Update: Alright, I can see a trend with answers explaining why volatile is bad. I respect those answers, but articles on that subject are easy to find online. What I can't find online, and the reason I'm asking this question, is how I'm protected without volatile. If the above code is correct, how is it invulnerable to caching issues?
The simplest answer is that volatile is not needed for multi-threading at all.
The longer answer is that sequence points such as critical sections are platform dependent, as is whatever threading solution you're using, so most of your thread safety is also platform dependent.
C++0x has a concept of threads and thread safety, but the current standard does not, and therefore volatile is sometimes misidentified as something that prevents the reordering of operations and memory accesses in multi-threaded programming. It was never intended for that and cannot reliably be used that way.
The only things volatile should be used for in C++ are to allow access to memory-mapped devices, to allow use of variables between setjmp and longjmp, and to allow use of sig_atomic_t variables in signal handlers. The keyword itself does not make a variable atomic.
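As a minimal sketch of the signal-handler case (the handler name on_sigint is made up for illustration, and the busy-wait is just to keep the example short):

#include <csignal>

// volatile sig_atomic_t is the combination the standard sanctions for
// communicating between a signal handler and the rest of the program.
volatile std::sig_atomic_t got_signal = 0;

void on_sigint(int)
{
    got_signal = 1;   // async-signal-safe store
}

int main()
{
    std::signal(SIGINT, on_sigint);
    while (!got_signal) {
        // do work; volatile stops the compiler from hoisting the check
        // out of the loop, but it provides no atomicity across threads
    }
    return 0;
}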
The good news is that in C++0x we will have the STL construct std::atomic, which can be used to guarantee atomic operations and thread-safe constructs for variables. Until your compiler of choice supports it, you may need to turn to the Boost library or bust out some assembly code to create your own objects that provide atomic variables.
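A rough sketch of how that looks on a compiler that already ships the C++0x <atomic> and <thread> headers (illustrative only, names as in the draft standard):

#include <atomic>
#include <thread>

std::atomic<int> counter(0);   // atomic; neither a mutex nor volatile is needed

void worker()
{
    for (int i = 0; i < 100000; ++i)
        counter.fetch_add(1, std::memory_order_relaxed);   // atomic increment
}

int main()
{
    std::thread a(worker), b(worker);
    a.join();
    b.join();
    return counter.load() == 200000 ? 0 : 1;   // guaranteed: no lost updates
}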
P.S. A lot of the confusion is caused by the fact that Java and .NET actually do enforce multi-threaded semantics with the keyword volatile; C++, however, follows suit with C, where this is not the case.
Your threading library should include the appropriate CPU and compiler barriers on mutex lock and unlock. For GCC, a "memory" clobber on an asm statement acts as a compiler barrier.
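For illustration only (your threading library already does the equivalent internally), a GCC compiler barrier might look like this:

// Empty asm with a "memory" clobber: it emits no instructions, but tells GCC
// that memory may have changed, so values cached in registers must be
// discarded and reloaded, and stores may not be moved across it.
// Note: this is a compiler barrier only; it emits no CPU fence instruction.
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

int shared_value;

int read_fresh(void)
{
    COMPILER_BARRIER();     // the compiler may not reuse an earlier copy
    return shared_value;    // so this is a fresh load of shared_value
}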
Actually, there are two things that protect your code from (compiler) caching:

- You are calling a non-pure external function (pthread_mutex_*()), which means that the compiler doesn't know that that function doesn't modify your global variables, so it has to reload them (see the sketch after this list).
- As I said, pthread_mutex_*() includes a compiler barrier, e.g. on glibc/x86, pthread_mutex_lock() ends up calling the macro lll_lock(), which has a "memory" clobber, forcing the compiler to reload variables.
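A sketch of the first point, with a hypothetical opaque() function standing in for any external call whose body the compiler cannot see:

extern void opaque(void);   // defined in another translation unit

int g;

int f(void)
{
    int a = g;     // the compiler may cache g in a register here
    opaque();      // opaque call: for all the compiler knows, g was modified,
    int b = g;     // so it must reload g from memory for this read
    return a + b;
}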
> If the above code is correct, how is it invulnerable to caching issues?

Until C++0x, it is not. And it is not specified in C. So, it really depends on the compiler. In general, if the compiler does not guarantee that it will respect ordering constraints on memory accesses for functions or operations that involve multiple threads, you will not be able to write multithread-safe code with that compiler. See Hans J. Boehm's Threads Cannot Be Implemented as a Library.

As for what abstractions your compiler should support for thread-safe code, the Wikipedia entry on Memory Barriers is a pretty good starting point.
(As for why people suggested volatile: some compilers treat volatile as a memory barrier for the compiler. It's definitely not standard.)
The volatile keyword is a hint to the compiler that the variable might change outside of program logic, such as a memory-mapped hardware register that could change as part of an interrupt service routine. This prevents the compiler from assuming a cached value is always correct and would normally force a memory read to retrieve the value. This usage pre-dates threading by a couple decades or so. I've seen it used with variables manipulated by signals as well, but I'm not sure that usage was correct.
Variables guarded by mutexes are guaranteed to be correct when read or written by different threads. The threading API is required to ensure that such views of variables are consistent. This access is all part of your program logic and the volatile keyword is irrelevant here.
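To make that concrete, here is a minimal sketch of the pattern the question asks about (the writer/reader functions are hypothetical): a plain, non-volatile int protected only by a pthread mutex. POSIX requires the lock and unlock calls to synchronize memory, so the reader sees the writer's value.

#include <pthread.h>

static int shared = 0;                                     // plain int, no volatile
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void* writer(void*)
{
    pthread_mutex_lock(&lock);
    shared = 42;                        // store under the lock
    pthread_mutex_unlock(&lock);
    return nullptr;
}

void* reader(void* out)
{
    pthread_mutex_lock(&lock);
    *static_cast<int*>(out) = shared;   // load under the same lock
    pthread_mutex_unlock(&lock);
    return nullptr;
}

int main()
{
    pthread_t w, r;
    int seen = 0;
    pthread_create(&w, nullptr, writer, nullptr);
    pthread_join(w, nullptr);           // writer finishes before reader starts
    pthread_create(&r, nullptr, reader, &seen);
    pthread_join(r, nullptr);
    return seen == 42 ? 0 : 1;          // reader is guaranteed to see 42
}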