Java ExecutorService: awaitTermination of all recursively created tasks

Solution 1:

This really is an ideal candidate for a Phaser. Java 7 is coming out with this new class. Its a flexible CountdonwLatch/CyclicBarrier. You can get a stable version at JSR 166 Interest Site.

The way it is a more flexible CountdownLatch/CyclicBarrier is because it is able to not only support an unknown number of parties (threads) but its also reusable (thats where the phase part comes in)

For each task you submit you would register, when that task is completed you arrive. This can be done recursively.

Phaser phaser = new Phaser();
ExecutorService e = //

Runnable recursiveRunnable = new Runnable(){
   public void run(){
      //do work recursively if you have to

      if(shouldBeRecursive){
           phaser.register();
           e.submit(recursiveRunnable);
      }

      phaser.arrive();
   }
}

public void doWork(){
   int phase = phaser.getPhase();

   phaser.register();
   e.submit(recursiveRunnable);

   phaser.awaitAdvance(phase);
}

Edit: Thanks @depthofreality for pointing out the race condition in my previous example. I am updating it so that executing thread only awaits advance of the current phase as it blocks for the recursive function to complete.

The phase number won't trip until the number of arrives == registers. Since prior to each recursive call invokes register a phase increment will happen when all invocations are complete.

Solution 2:

If number of tasks in the tree of recursive tasks is initially unknown, perhaps the easiest way would be to implement your own synchronization primitive, some kind of "inverse semaphore", and share it among your tasks. Before submitting each task you increment a value, when task is completed, it decrements that value, and you wait until the value is 0.

Implementing it as a separate primitive explicitly called from tasks decouples this logic from the thread pool implementation and allows you to submit several independent trees of recursive tasks into the same pool.

Something like this:

public class InverseSemaphore {
    private int value = 0;
    private Object lock = new Object();

    public void beforeSubmit() {
        synchronized(lock) {
            value++;
        }
    }

    public void taskCompleted() {
        synchronized(lock) {
            value--;
            if (value == 0) lock.notifyAll();
        }
    }

    public void awaitCompletion() throws InterruptedException {
        synchronized(lock) {
            while (value > 0) lock.wait();
        }
    }
}

Note that taskCompleted() should be called inside a finally block, to make it immune to possible exceptions.

Also note that beforeSubmit() should be called by the submitting thread before the task is submitted, not by the task itself, to avoid possible "false completion" when old tasks are completed and new ones not started yet.

EDIT: Important problem with usage pattern fixed.

Solution 3:

Wow, you guys are quick:)

Thank you for all the suggestions. Futures don't easily integrate with my model because I don't know how many runnables are scheduled beforehand. So if I keep a parent task alive just to wait for it's recursive child tasks to finish I have a lot of garbage laying around.

I solved my problem using the AtomicInteger suggestion. Essentially, I subclassed ThreadPoolExecutor and increment the counter on calls to execute() and decrement on calls to afterExecute(). When the counter gets 0 I call shutdown(). This seems to work for my problems, not sure if that's a generally good way to do that. Especially, I assume that you only use execute() to add Runnables.

As a side node: I first tried to check in afterExecute() the number of Runnables in the queue and the number of workers that are active and shutdown when those are 0; but that didn't work because not all Runnables showed up in the queue and the getActiveCount() didn't do what I expected either.

Anyhow, here's my solution: (if anybody finds serious problems with this, please let me know:)

public class MyThreadPoolExecutor extends ThreadPoolExecutor {

    private final AtomicInteger executing = new AtomicInteger(0);

    public MyThreadPoolExecutor(int coorPoolSize, int maxPoolSize, long keepAliveTime,
        TimeUnit seconds, BlockingQueue<Runnable> queue) {
        super(coorPoolSize, maxPoolSize, keepAliveTime, seconds, queue);
    }


    @Override
    public void execute(Runnable command) {
        //intercepting beforeExecute is too late!
        //execute() is called in the parent thread before it terminates
        executing.incrementAndGet();
        super.execute(command);
    }


    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        int count = executing.decrementAndGet();
        if(count == 0) {
            this.shutdown();
        }
    }

}

Solution 4:

You could create your own thread pool which extends ThreadPoolExecutor. You want to know when a task has been submitted and when it completes.

public class MyThreadPoolExecutor extends ThreadPoolExecutor {
    private int counter = 0;

    public MyThreadPoolExecutor() {
        super(1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    public synchronized void execute(Runnable command) {
        counter++;
        super.execute(command);
    }

    @Override
    protected synchronized void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        counter--;
        notifyAll();
    }

    public synchronized void waitForExecuted() throws InterruptedException {
        while (counter == 0)
            wait();
    }
}