Implementing coroutines in Java

This question is related to my question on existing coroutine implementations in Java. If, as I suspect, it turns out that there is no full implementation of coroutines currently available in Java, what would be required to implement them?

As I said in that question, I know about the following:

  1. You can implement "coroutines" as threads/thread pools behind the scenes.
  2. You can do tricksy things with JVM bytecode behind the scenes to make coroutines possible.
  3. The so-called "Da Vinci Machine" JVM implementation has primitives that make coroutines doable without bytecode manipulation.
  4. There are various JNI-based approaches to coroutines also possible.

I'll address each one's deficiencies in turn.

Thread-based coroutines

This "solution" is pathological. The whole point of coroutines is to avoid the overhead of threading, locking, kernel scheduling, etc. Coroutines are supposed to be light and fast and to execute only in user space. Implementing them in terms of full-tilt threads with tight restrictions gets rid of all the advantages.

JVM bytecode manipulation

This solution is more practical, albeit a bit difficult to pull off. This is roughly the same as jumping down into assembly language for coroutine libraries in C (which is how many of them work) with the advantage that you have only one architecture to worry about and get right.

It also ties you down to only running your code on fully-compliant JVM stacks (which means, for example, no Android) unless you can find a way to do the same thing on the non-compliant stack. If you do find a way to do this, however, you have now doubled your system complexity and testing needs.

The Da Vinci Machine

The Da Vinci Machine is cool for experimentation, but since it is not a standard JVM its features aren't going to be available everywhere. Indeed I suspect most production environments would specifically forbid the use of the Da Vinci Machine. Thus I could use this to make cool experiments but not for any code I expect to release to the real world.

This also has the added problem similar to the JVM bytecode manipulation solution above: won't work on alternative stacks (like Android's).

JNI implementation

This solution renders the point of doing this in Java at all moot. Each combination of CPU and operating system requires independent testing and each is a point of potentially frustrating subtle failure. Alternatively, of course, I could tie myself down to one platform entirely but this, too, makes the point of doing things in Java entirely moot.

So...

Is there any way to implement coroutines in Java without using one of these four techniques? Or will I be forced to use the one of those four that smells the least (JVM manipulation) instead?


Edited to add:

Just to ensure that confusion is contained, this is a related question to my other one, but not the same. That one is looking for an existing implementation in a bid to avoid reinventing the wheel unnecessarily. This one is a question relating to how one would go about implementing coroutines in Java should the other prove unanswerable. The intent is to keep different questions on different threads.


I would take a look at this: http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html, its pretty interesting and should provide a good place to start. But of course we are using Java so we can do better (or maybe worse because there are no macros :))

From my understanding with coroutines you usually have a producer and a consumer coroutine (or at least this is the most common pattern). But semantically you don't want the producer to call the consumer or visa-versa because this introduces an asymmetry. But given the way stack based languages work we will need to have someone do the calling.

So here is a very simple type hierarchy:

public interface CoroutineProducer<T>
{
    public T Produce();
    public boolean isDone();
}

public interface CoroutineConsumer<T>
{
    public void Consume(T t);
}

public class CoroutineManager
{
    public static Execute<T>(CoroutineProducer<T> prod, CoroutineConsumer<T> con)
    {
        while(!prod.IsDone()) // really simple
        {
            T d = prod.Produce();
            con.Consume(d);
        }
    }
}

Now of course the hard part is implementing the interfaces, in particular it is difficult to break a computation into individual steps. For this you would probably want a whole other set of persistent control structures. The basic idea is that we want to simulate non-local transfer of control (in the end its kinda like we're simulating a goto). We basically want to move away from using the stack and the pc (program-counter) by keeping the state of our current operations in the heap instead of on the stack. Therefore we are going to need a bunch of helper classes.

For example:

Let's say that in an ideal world you wanted to write a consumer that looked like this (psuedocode):

boolean is_done;
int other_state;
while(!is_done)
{
    //read input
    //parse input
    //yield input to coroutine
    //update is_done and other_state;
}

we need to abstract the local variable like is_doneand other_state and we need to abstract the while loop itself because our yield like operation is not going to be using the stack. So let's create a while loop abstraction and associated classes:

enum WhileState {BREAK, CONTINUE, YIELD}
abstract class WhileLoop<T>
{
    private boolean is_done;
    public boolean isDone() { return is_done;}
    private T rval;
    public T getReturnValue() {return rval;} 
    protected void setReturnValue(T val)
    {
        rval = val;
    }


    public T loop()
    {
        while(true)
        {
            WhileState state = execute();
            if(state == WhileState.YIELD)
                return getReturnValue();
            else if(state == WhileState.BREAK)
                    {
                       is_done = true;
                return null;
                    }
        }
    }
    protected abstract WhileState execute();
}

The Basic trick here is to move local variables to be class variables and turn scope blocks into classes which gives us the ability to 're-enter' our 'loop' after yielding our return value.

Now to implement our producer

public class SampleProducer : CoroutineProducer<Object>
{
    private WhileLoop<Object> loop;//our control structures become state!!
    public SampleProducer()
    {
        loop = new WhileLoop()
        {
            private int other_state;//our local variables become state of the control structure
            protected WhileState execute() 
            {
                //this implements a single iteration of the loop
                if(is_done) return WhileState.BREAK;
                //read input
                //parse input
                Object calcluated_value = ...;
                //update is_done, figure out if we want to continue
                setReturnValue(calculated_value);
                return WhileState.YIELD;
            }
        };
    }
    public Object Produce()
    {
        Object val = loop.loop();
        return val;
    }
    public boolean isDone()
    {
        //we are done when the loop has exited
        return loop.isDone();
    }
}

Similar tricks could be done for other basic control flow structures. You would ideally build up a library of these helper classes and then use them to implement these simple interfaces which would ultimately give you the semantics of co-routines. I'm sure everything I've written here can be generalized and expanded upon greatly.


I'd suggest to look at Kotlin coroutines on JVM. It falls into a different category, though. There is no byte-code manipulation involved and it works on Android, too. However, you will have to write your coroutines in Kotlin. The upside is that Kotlin is designed for interoperability with Java in mind, so you can still continue to use all your Java libraries and freely combine Kotlin and Java code in the same project, even putting them side-by-side in the same directories and packages.

This Guide to kotlinx.coroutines provides many more examples, while the coroutines design document explains all the motivation, use-cases and implementation details.


Kotlin uses the following approach for co-routines
(from https://kotlinlang.org/docs/reference/coroutines.html):

Coroutines are completely implemented through a compilation technique (no support from the VM or OS side is required), and suspension works through code transformation. Basically, every suspending function (optimizations may apply, but we'll not go into this here) is transformed to a state machine where states correspond to suspending calls. Right before a suspension, the next state is stored in a field of a compiler-generated class along with relevant local variables, etc. Upon resumption of that coroutine, local variables are restored and the state machine proceeds from the state right after suspension.

A suspended coroutine can be stored and passed around as an object that keeps its suspended state and locals. The type of such objects is Continuation, and the overall code transformation described here corresponds to the classical Continuation-passing style. Consequently, suspending functions take an extra parameter of type Continuation under the hood.

Check out the design document at https://github.com/Kotlin/kotlin-coroutines/blob/master/kotlin-coroutines-informal.md


I just came across this question and just want to mention that i think it might be possible to implement coroutines or generators in a similar way C# does. That said i don't actually use Java but the CIL has quite similar limitations as the JVM has.

The yield statement in C# is a pure language feature and is not part of the CIL bytecode. The C# compiler just creates a hidden private class for each generator function. If you use the yield statement in a function it has to return an IEnumerator or an IEnumerable. The compiler "packs" your code into a statemachine-like class.

The C# compiler might use some "goto's" in the generated code to make the conversion into a statemachine easier. I don't know the capabilities of Java bytecode and if there's something like a plain unconditional jump, but at "assembly level" it's usually possible.

As already mentioned this feature has to be implemented in the compiler. Because i have only little knowledge about Java and it's compiler i can't tell if it's possible to alter / extend the compiler, maybe with a "preprocessor" or something.

Personally i love coroutines. As a Unity games developer i use them quite often. Because i play alot of Minecraft with ComputerCraft i was curious why coroutines in Lua (LuaJ) are implemented with threads.


There is also Quasar for Java and Project Loom at Oracle where extensions are made to the JVM for fibers and continuations. Here is a presentation of Loom on Youtoube. There are several more. Easy to find with a little searching.