How should I unit test multithreaded code?

I have thus far avoided the nightmare that is testing multi-threaded code since it just seems like too much of a minefield. I'd like to ask how people have gone about testing code that relies on threads for successful execution, or just how people have gone about testing those kinds of issues that only show up when two threads interact in a given manner?

This seems like a really key problem for programmers today, it would be useful to pool our knowledge on this one imho.


Look, there's no easy way to do this. I'm working on a project that is inherently multithreaded. Events come in from the operating system and I have to process them concurrently.

The simplest way to deal with testing complex, multithreaded application code is this: If it's too complex to test, you're doing it wrong. If you have a single instance that has multiple threads acting upon it, and you can't test situations where these threads step all over each other, then your design needs to be redone. It's both as simple and as complex as this.

There are many ways to program for multithreading that avoids threads running through instances at the same time. The simplest is to make all your objects immutable. Of course, that's not usually possible. So you have to identify those places in your design where threads interact with the same instance and reduce the number of those places. By doing this, you isolate a few classes where multithreading actually occurs, reducing the overall complexity of testing your system.

But you have to realize that even by doing this, you still can't test every situation where two threads step on each other. To do that, you'd have to run two threads concurrently in the same test, then control exactly what lines they are executing at any given moment. The best you can do is simulate this situation. But this might require you to code specifically for testing, and that's at best a half step towards a true solution.

Probably the best way to test code for threading issues is through static analysis of the code. If your threaded code doesn't follow a finite set of thread safe patterns, then you might have a problem. I believe Code Analysis in VS does contain some knowledge of threading, but probably not much.

Look, as things stand currently (and probably will stand for a good time to come), the best way to test multithreaded apps is to reduce the complexity of threaded code as much as possible. Minimize areas where threads interact, test as best as possible, and use code analysis to identify danger areas.


It's been a while when this question was posted, but it's still not answered ...

kleolb02's answer is a good one. I'll try going into more details.

There is a way, which I practice for C# code. For unit tests you should be able to program reproducible tests, which is the biggest challenge in multithreaded code. So my answer aims toward forcing asynchronous code into a test harness, which works synchronously.

It's an idea from Gerard Meszardos's book "xUnit Test Patterns" and is called "Humble Object" (p. 695): You have to separate core logic code and anything which smells like asynchronous code from each other. This would result to a class for the core logic, which works synchronously.

This puts you into the position to test the core logic code in a synchronous way. You have absolute control over the timing of the calls you are doing on the core logic and thus can make reproducible tests. And this is your gain from separating core logic and asynchronous logic.

This core logic needs be wrapped around by another class, which is responsible for receiving calls to the core logic asynchronously and delegates these calls to the core logic. Production code will only access the core logic via that class. Because this class should only delegate calls, it's a very "dumb" class without much logic. So you can keep your unit tests for this asychronous working class at a minimum.

Anything above that (testing interaction between classes) are component tests. Also in this case, you should be able to have absolute control over timing, if you stick to the "Humble Object" pattern.


Tough one indeed! In my (C++) unit tests, I've broken this down into several categories along the lines of the concurrency pattern used:

  1. Unit tests for classes that operate in a single thread and aren't thread aware -- easy, test as usual.

  2. Unit tests for Monitor objects (those that execute synchronized methods in the callers' thread of control) that expose a synchronized public API -- instantiate multiple mock threads that exercise the API. Construct scenarios that exercise internal conditions of the passive object. Include one longer running test that basically beats the heck out of it from multiple threads for a long period of time. This is unscientific I know but it does build confidence.

  3. Unit tests for Active objects (those that encapsulate their own thread or threads of control) -- similar to #2 above with variations depending on the class design. Public API may be blocking or non-blocking, callers may obtain futures, data may arrive at queues or need to be dequeued. There are many combinations possible here; white box away. Still requires multiple mock threads to make calls to the object under test.

As an aside:

In internal developer training that I do, I teach the Pillars of Concurrency and these two patterns as the primary framework for thinking about and decomposing concurrency problems. There's obviously more advanced concepts out there but I've found that this set of basics helps keep engineers out of the soup. It also leads to code that is more unit testable, as described above.


I have faced this issue several times in recent years when writing thread handling code for several projects. I'm providing a late answer because most of the other answers, while providing alternatives, do not actually answer the question about testing. My answer is addressed to the cases where there is no alternative to multithreaded code; I do cover code design issues for completeness, but also discuss unit testing.

Writing testable multithreaded code

The first thing to do is to separate your production thread handling code from all the code that does actual data processing. That way, the data processing can be tested as singly threaded code, and the only thing the multithreaded code does is to coordinate threads.

The second thing to remember is that bugs in multithreaded code are probabilistic; the bugs that manifest themselves least frequently are the bugs that will sneak through into production, will be difficult to reproduce even in production, and will thus cause the biggest problems. For this reason, the standard coding approach of writing the code quickly and then debugging it until it works is a bad idea for multithreaded code; it will result in code where the easy bugs are fixed and the dangerous bugs are still there.

Instead, when writing multithreaded code, you must write the code with the attitude that you are going to avoid writing the bugs in the first place. If you have properly removed the data processing code, the thread handling code should be small enough - preferably a few lines, at worst a few dozen lines - that you have a chance of writing it without writing a bug, and certainly without writing many bugs, if you understand threading, take your time, and are careful.

Writing unit tests for multithreaded code

Once the multithreaded code is written as carefully as possible, it is still worthwhile writing tests for that code. The primary purpose of the tests is not so much to test for highly timing dependent race condition bugs - it's impossible to test for such race conditions repeatably - but rather to test that your locking strategy for preventing such bugs allows for multiple threads to interact as intended.

To properly test correct locking behavior, a test must start multiple threads. To make the test repeatable, we want the interactions between the threads to happen in a predictable order. We don't want to externally synchronize the threads in the test, because that will mask bugs that could happen in production where the threads are not externally synchronized. That leaves the use of timing delays for thread synchronization, which is the technique that I have used successfully whenever I've had to write tests of multithreaded code.

If the delays are too short, then the test becomes fragile, because minor timing differences - say between different machines on which the tests may be run - may cause the timing to be off and the test to fail. What I've typically done is start with delays that cause test failures, increase the delays so that the test passes reliably on my development machine, and then double the delays beyond that so the test has a good chance of passing on other machines. This does mean that the test will take a macroscopic amount of time, though in my experience, careful test design can limit that time to no more than a dozen seconds. Since you shouldn't have very many places requiring thread coordination code in your application, that should be acceptable for your test suite.

Finally, keep track of the number of bugs caught by your test. If your test has 80% code coverage, it can be expected to catch about 80% of your bugs. If your test is well designed but finds no bugs, there's a reasonable chance that you don't have additional bugs that will only show up in production. If the test catches one or two bugs, you might still get lucky. Beyond that, and you may want to consider a careful review of or even a complete rewrite of your thread handling code, since it is likely that code still contains hidden bugs that will be very difficult to find until the code is in production, and very difficult to fix then.


I also had serious problems testing multi- threaded code. Then I found a really cool solution in "xUnit Test Patterns" by Gerard Meszaros. The pattern he describes is called Humble object.

Basically it describes how you can extract the logic into a separate, easy-to-test component that is decoupled from its environment. After you tested this logic, you can test the complicated behaviour (multi- threading, asynchronous execution, etc...)