How are mutexes implemented?

Check out the description of the test-and-set machine instruction on Wikipedia, which alludes to how atomic operations are achieved at the machine level. I'd expect most language-level mutex implementations to rely on machine-level support such as test-and-set.
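To make that concrete, here is a minimal sketch of a spinlock built on test-and-set, assuming C++11's std::atomic_flag as the portable stand-in for the machine instruction. The names SpinLock and count_with_spinlock are made up for illustration, and a real mutex would normally sleep instead of spinning.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical name: a minimal spinlock built on test-and-set.
class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;

public:
    void lock() {
        // test_and_set atomically sets the flag and returns its previous value.
        // A previous value of true means another thread holds the lock, so retry.
        while (flag.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // be polite while spinning
        }
    }

    void unlock() {
        // Clearing the flag lets the next test_and_set succeed.
        flag.clear(std::memory_order_release);
    }
};

// Hypothetical helper: several threads increment one counter under the lock.
long count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the spinlock
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

If the lock works, count_with_spinlock(4, 10000) returns exactly 40000, because no increment is lost.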


Building on Adamski's test-and-set suggestion, you should also look at the concept of "fast user-space mutexes" or futexes.

Futexes have the desirable property that they do not require a kernel system call in the common cases of locking or unlocking an uncontended mutex. In these cases, the user-mode code successfully uses an atomic compare-and-swap (CAS) operation to lock or unlock the mutex.

If CAS fails, the mutex is contended and a kernel system call -- sys_futex under Linux -- must be used either to wait for the mutex (in the lock case) or to wake other threads (in the unlock case).
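As a rough, Linux-only sketch of that fast path and slow path: the class below uses a three-state word (0 = unlocked, 1 = locked, 2 = locked with possible waiters), a design commonly used for futex-based locks. FutexMutex and count_with_futex_mutex are hypothetical names, the unlock path uses an atomic decrement rather than CAS, and passing the address of a std::atomic<int> to the system call assumes it has the same layout as a plain int, which holds on mainstream Linux toolchains but is not guaranteed by the standard.

```cpp
#include <atomic>
#include <thread>
#include <vector>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical class. state: 0 = unlocked, 1 = locked, 2 = locked, maybe waiters.
class FutexMutex {
    std::atomic<int> state{0};

    // Thin wrapper over the raw Linux futex system call.
    long futex(int op, int value) {
        return syscall(SYS_futex, reinterpret_cast<int*>(&state), op, value,
                       nullptr, nullptr, 0);
    }

public:
    void lock() {
        int seen = 0;
        // Fast path: uncontended, a single CAS in user space, no system call.
        if (state.compare_exchange_strong(seen, 1, std::memory_order_acquire)) {
            return;
        }
        // Slow path: mark the lock as contended, then sleep in the kernel.
        if (seen != 2) {
            seen = state.exchange(2, std::memory_order_acquire);
        }
        while (seen != 0) {
            // Sleeps only if state is still 2; otherwise returns immediately.
            futex(FUTEX_WAIT_PRIVATE, 2);
            seen = state.exchange(2, std::memory_order_acquire);
        }
    }

    void unlock() {
        // Fast path: state was 1 (no waiters), so 1 -> 0 needs no system call.
        if (state.fetch_sub(1, std::memory_order_release) != 1) {
            // State was 2: reset it and wake one sleeping waiter.
            state.store(0, std::memory_order_release);
            futex(FUTEX_WAKE_PRIVATE, 1);
        }
    }
};

// Hypothetical helper: several threads increment one counter under the mutex.
long count_with_futex_mutex(int threads, int iterations) {
    FutexMutex mutex;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&mutex, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                mutex.lock();
                ++counter;
                mutex.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

The point of the extra "2" state is that unlock only pays for a system call when some thread may actually be asleep.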

If you're serious about implementing this yourself, make sure you also read Ulrich Drepper's paper, "Futexes Are Tricky".


A mutex preferably runs in the kernel of the operating system, with as little code around it as possible, so that it isn't cut off by a task switch to another process. The exact implementation therefore varies by platform and can be a bit opaque. It's not complex, though: it's basically an object with a boolean field, which it gets and sets.

  • With a counter in place of the boolean, it becomes a semaphore.
  • A mutex is the starting point for a critical section, which uses a mutex internally to decide whether it can enter a section of code. If the mutex is free, it sets the mutex, executes the code, and releases the mutex when done. If the mutex is already locked, the critical section waits for it to be released.
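The two points above can be sketched together. This is an illustration only, assuming the standard std::mutex and std::condition_variable as building blocks; the class name CountingSemaphore is made up, and real semaphores are usually provided by the OS or the standard library.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical class: a counter guarded by a mutex behaves like a semaphore.
class CountingSemaphore {
    std::mutex m;                // the critical section around the counter
    std::condition_variable cv;  // lets waiters sleep until the count rises
    int count;

public:
    explicit CountingSemaphore(int initial) : count(initial) {}

    // Block until a unit is available, then take it.
    void acquire() {
        std::unique_lock<std::mutex> guard(m);
        cv.wait(guard, [this] { return count > 0; });
        --count;
    }

    // Take a unit only if one is available right now.
    bool try_acquire() {
        std::lock_guard<std::mutex> guard(m);
        if (count == 0) return false;
        --count;
        return true;
    }

    // Return a unit and wake one waiter, if any.
    void release() {
        {
            std::lock_guard<std::mutex> guard(m);
            ++count;
        }
        cv.notify_one();
    }
};
```

Each method body is itself a small critical section: it takes the mutex, touches the counter, and releases the mutex on the way out. A CountingSemaphore constructed with 1 behaves much like a plain mutex.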

Around the basic mutex logic there are wrappers that turn it into an object, then more wrapper objects to make it available outside the kernel, and then another wrapper to make it available in .NET. On top of that, programmers write their own wrapper code around all of this for their own needs. These wrappers around wrappers make mutexes murky territory.

Now, with this basic knowledge of mutex internals, all I hope is that you'll use an implementation that relies on the kernel and the hardware underneath, provided the hardware supports it. Those are the most reliable. A mutex that doesn't work at the kernel/hardware level can still be reliable, but I would advise against using it unless there's no alternative.

As far as I know, Windows, Linux and .NET all use mutexes at the kernel/hardware level.

The Wikipedia page that I've linked to explains more about the internal logic and possible implementations. Preferably, a mutex is controlled by the hardware, which makes getting and setting the mutex a single indivisible step, so the system can't switch tasks in between.