What does the suspend function mean in a Kotlin Coroutine?

Suspending functions are at the center of everything coroutines. A suspending function is simply a function that can be paused and resumed at a later time. They can execute a long running operation and wait for it to complete without blocking.

The syntax of a suspending function is similar to that of a regular function except for the addition of the suspend keyword. It can take a parameter and have a return type. However, suspending functions can only be invoked by another suspending function or within a coroutine.

suspend fun backgroundTask(param: Int): Int {
     // long running operation
}

Under the hood, suspend functions are converted by the compiler to another function without the suspend keyword, that takes an addition parameter of type Continuation<T>. The function above for example, will be converted by the compiler to this:

fun backgroundTask(param: Int, callback: Continuation<Int>): Int {
   // long running operation
}

Continuation<T> is an interface that contains two functions that are invoked to resume the coroutine with a return value or with an exception if an error had occurred while the function was suspended.

interface Continuation<in T> {
   val context: CoroutineContext
   fun resume(value: T)
   fun resumeWithException(exception: Throwable)
}

But what does suspend mean?

Functions marked with the suspend keyword are transformed at compile time to be made asynchronous under the hood, even though they appear synchronous in the source code.

The best source to understand this transformation IMO is the talk "Deep Dive into Coroutines" by Roman Elizarov.

This includes the following changes to the function:

  • The return type is changed to Unit, which is how Kotlin represents void functions
  • It gets an additional Continuation<X> argument (where X is the former return type of the function that was declared in the code). This continuation acts like a callback.
  • Its body is turned into a state machine (instead of literally using callbacks, for efficiency). This is done by breaking down the body of the function into parts around so called suspension points, and turning those parts into the branches of a big switch. The state about the local variables and where we are in the switch is stored inside the Continuation object.

This is a very quick way to describe it, but you can see it happen with more details and with examples in the talk. This whole transformation is basically how the "suspend/resume" mechanism is implemented under the hood.

Coroutine or function gets suspended?

At a high level, we say that calling a suspending function suspends the coroutine, meaning the current thread can start executing another coroutine. So, the coroutine is said to be suspended rather than the function.

In fact, call sites of suspending functions are called "suspension points" for this reason.

Which coroutine gets suspended?

Let's look at your code and break down what happens:

// 1. this call starts a new coroutine (let's call it C1).
//    If there were code after it, it would be executed concurrently with
//    the body of this async
async {
    ...
    // 2. this is a regular function call, so we go to computation()'s body
    val deferred = computation()
    // 4. because await() is suspendING, it suspends coroutine C1.
    //    This means that if we had a single thread in our dispatcher, 
    //    it would now be free to go execute C2
    // 7. once C2 completes, C1 is resumed with the result `true` of C2's async
    val result = deferred.await() 
    ...
    // 8. C1 can now keep going in the current thread until it gets 
    //    suspended again (or not)
}

fun computation(): Deferred<Boolean> {
    // 3. this async call starts a second coroutine (C2). Depending on the 
    //    dispatcher you're using, you may have one or more threads.
    // 3.a. If you have multiple threads, the block of this async could be
    //      executed in parallel of C1 in another thread
    // 3.b. If you have only one thread, the block is sort of "queued" but 
    //      not executed right away (as in an event loop)
    //
    //    In both cases, we say that this block executes "concurrently"
    //    with C1, and computation() immediately returns the Deferred
    //    instance to its caller (unless a special dispatcher or 
    //    coroutine start argument is used, but let's keep it simple).
    return async {
        // 5. this may now be executed
        true
        // 6. C2 is now completed, so the thread can go back to executing 
        //    another coroutine (e.g. C1 here)
    }
}

The outer async starts a coroutine. When it calls computation(), the inner async starts a second coroutine. Then, the call to await() suspends the execution of the outer async coroutine, until the execution of the inner async's coroutine is over.

You can even see that with a single thread: the thread will execute the outer async's beginning, then call computation() and reach the inner async. At this point, the body of the inner async is skipped, and the thread continues executing the outer async until it reaches await(). await() is a "suspension point", because await is a suspending function. This means that the outer coroutine is suspended, and thus the thread starts executing the inner one. When it is done, it comes back to execute the end of the outer async.

Does suspend mean that while outer async coroutine is waiting (await) for the inner computation coroutine to finish, it (the outer async coroutine) idles (hence the name suspend) and returns thread to the thread pool, and when the child computation coroutine finishes, it (the outer async coroutine) wakes up, takes another thread from the pool and continues?

Yes, precisely.

The way this is actually achieved is by turning every suspending function into a state machine, where each "state" corresponds to a suspension point inside this suspend function. Under the hood, the function can be called multiple times, with the information about which suspension point it should start executing from (you should really watch the video I linked for more info about that).


To understand what exactly it means to suspend a coroutine, I suggest you go through this code:

import kotlinx.coroutines.Dispatchers.Unconfined
import kotlinx.coroutines.GlobalScope
import kotlinx.coroutines.launch
import kotlin.coroutines.Continuation
import kotlin.coroutines.resume
import kotlin.coroutines.suspendCoroutine

var continuation: Continuation<Int>? = null

fun main() {
    GlobalScope.launch(Unconfined) {
        val a = a()
        println("Result is $a")
    }
    10.downTo(0).forEach {
        continuation!!.resume(it)
    }
}

suspend fun a(): Int {
    return b()
}

suspend fun b(): Int {
    while (true) {
        val i = suspendCoroutine<Int> { cont -> continuation = cont }
        if (i == 0) {
            return 0
        }
    }
}

The Unconfined coroutine dispatcher eliminates the magic of coroutine dispatching and allows us to focus directly on bare coroutines.

The code inside the launch block starts executing right away on the current thread, as a part of the launch call. What happens is as follows:

  1. Evaluate val a = a()
  2. This chains to b(), reaching suspendCoroutine.
  3. Function b() executes the block passed to suspendCoroutine and then returns a special COROUTINE_SUSPENDED value. This value is not observable through the Kotlin programming model, but that's what the compiled Java method does.
  4. Function a(), seeing this return value, itself also returns it.
  5. The launch block does the same and control now returns to the line after the launch invocation: 10.downTo(0)...

Note that, at this point, you have the same effect as if the code inside the launch block and your fun main code are executing concurrently. It just happens that all this is happening on a single native thread so the launch block is "suspended".

Now, inside the forEach looping code, the program reads the continuation that the b() function wrote and resumes it with the value of 10. resume() is implemented in such a way that it will be as if the suspendCoroutine call returned with the value you passed in. So you suddenly find yourself in the middle of executing b(). The value you passed to resume() gets assigned to i and checked against 0. If it's not zero, the while (true) loop goes on inside b(), again reaching suspendCoroutine, at which point your resume() call returns, and now you go through another looping step in forEach(). This goes on until finally you resume with 0, then the println statement runs and the program completes.

The above analysis should give you the important intuition that "suspending a coroutine" means returning the control back to the innermost launch invocation (or, more generally, coroutine builder). If a coroutine suspends again after resuming, the resume() call ends and control returns to the caller of resume().

The presence of a coroutine dispatcher makes this reasoning less clear-cut because most of them immediately submit your code to another thread. In that case the above story happens in that other thread, and the coroutine dispatcher also manages the continuation object so it can resume it when the return value is available.