Golang blocking and non blocking

I am somewhat confused about how Go handles non-blocking IO. The APIs mostly look synchronous to me, and when watching presentations on Go, it's not uncommon to hear comments like "and the call blocks".

Is Go using blocking IO when reading from files or the network? Or is there some kind of magic that rewrites the code when it is used from inside a goroutine?

Coming from a C# background, this feels very unintuitive. In C# we have the await keyword when consuming async APIs, which clearly communicates that the API can yield the current thread and continue later inside a continuation.

So, TL;DR: will Go block the current thread when doing IO inside a goroutine, or will the code be transformed into a C#-like async/await state machine using continuations?


Solution 1:

Go has a scheduler that lets you write synchronous code; it does the context switching on its own and uses async IO under the hood. So if you're running several goroutines, they might all run on a single system thread, and when your code is blocking from the goroutine's point of view, it's not really blocking the thread. It's not magic, but yes, it hides all of this from you.

The scheduler will allocate system threads when they're needed, for example during operations that really do block (I think file IO is blocking, for example, or calling C code). But if you're running a simple HTTP server, you can have thousands and thousands of goroutines using just a handful of "real threads".

You can read more about the inner workings of Go here:

https://morsmachine.dk/go-scheduler

Solution 2:

You should read @Not_a_Golfer's answer first, along with the link he provided, to understand how goroutines are scheduled. My answer is more of a deeper dive into network IO specifically. I assume you understand how Go achieves cooperative multitasking.

Go can and does use only blocking calls because everything runs in goroutines and they're not real OS threads. They're green threads. So you can have many of them all blocking on IO calls and they will not eat all of your memory and CPU like OS threads would.

File IO is just syscalls. Not_a_Golfer already covered that. Go will use a real OS thread to wait on the syscall and will unblock the goroutine when it returns. Here you can see the file read implementation for Unix.

Network IO is different. The runtime uses a "network poller" to determine which goroutine should be unblocked from an IO call. Depending on the target OS, it uses the available asynchronous APIs to wait for network IO events. The calls look blocking, but inside everything is done asynchronously.

For example, when you call read on a TCP socket, the goroutine first tries to read using a syscall. If nothing has arrived yet, it blocks and waits to be resumed. By blocking here I mean parking, which puts the goroutine in a queue where it awaits resumption. That's how a "blocked" goroutine yields execution to other goroutines when you use network IO.

func (fd *netFD) Read(p []byte) (n int, err error) {
    if err := fd.readLock(); err != nil {
        return 0, err
    }
    defer fd.readUnlock()
    if err := fd.pd.PrepareRead(); err != nil {
        return 0, err
    }
    for {
        n, err = syscall.Read(fd.sysfd, p)
        if err != nil {
            n = 0
            if err == syscall.EAGAIN {
                if err = fd.pd.WaitRead(); err == nil {
                    continue
                }
            }
        }
        err = fd.eofError(n, err)
        break
    }
    if _, ok := err.(syscall.Errno); ok {
        err = os.NewSyscallError("read", err)
    }
    return
}

https://golang.org/src/net/fd_unix.go?s=#L237
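To see the effect from user code, here is a small sketch under a few assumptions: it uses a loopback TCP listener on an OS-chosen port, and readWhileOthersRun is a hypothetical helper written for this example. One goroutine sits in conn.Read while the calling goroutine keeps counting; if Read stopped the whole program, the counter would never advance.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// readWhileOthersRun starts a reader goroutine that waits in conn.Read and
// meanwhile keeps the calling goroutine busy. It returns what the reader
// received and how many iterations the caller completed while waiting.
func readWhileOthersRun(msg string) (string, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: let the OS pick one
	if err != nil {
		return "", 0, err
	}
	defer ln.Close()

	// Server side: accept one connection, delay, then send msg.
	go func() {
		c, err := ln.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		time.Sleep(50 * time.Millisecond)
		c.Write([]byte(msg))
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", 0, err
	}
	defer conn.Close()

	type result struct {
		data string
		err  error
	}
	done := make(chan result, 1)
	go func() {
		buf := make([]byte, 1024)
		// Looks like a blocking call; the goroutine is parked until data
		// arrives. A short message on loopback normally arrives in one read.
		n, err := conn.Read(buf)
		done <- result{string(buf[:n]), err}
	}()

	ticks := 0
	for {
		select {
		case r := <-done:
			return r.data, ticks, r.err
		default:
			ticks++ // other work continues while the reader is parked
			time.Sleep(time.Millisecond)
		}
	}
}

func main() {
	data, ticks, err := readWhileOthersRun("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read %q after %d ticks of other work\n", data, ticks)
}
```

The tick count varies from run to run; the point is only that it is greater than zero.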

When data arrives, the network poller returns the goroutines that should be resumed. You can see here the findrunnable function, which searches for goroutines that can be run. It calls the netpoll function, which returns the goroutines that can be resumed. You can find the kqueue implementation of netpoll here.

As for async/await in C#: async network IO also uses asynchronous APIs (IO completion ports on Windows). When something arrives, the OS executes a callback on one of the thread pool's completion port threads, which puts the continuation on the current SynchronizationContext. In a sense there are some similarities (parking/unparking does look like calling continuations, but on a much lower level), yet these models are very different, not to mention the implementations.

Goroutines by default are not bound to a specific OS thread; they can be resumed on any of them, it doesn't matter. There are no UI threads to deal with. Async/await is specifically made for the purpose of resuming the work on the same OS thread using SynchronizationContext. And because there are no green threads and no separate scheduler, async/await has to split your function into multiple callbacks that get executed on the SynchronizationContext, which is basically an infinite loop that checks a queue of callbacks to execute. You can even implement it yourself, it's really easy.
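For readers coming from C#, a rough Go counterpart to starting two tasks and awaiting both might look like the sketch below. This is only an analogy, not a description of how either runtime is implemented. The functions slowSquare, squareAsync and sumOfSquares are hypothetical, and time.Sleep stands in for an IO-bound call.

```go
package main

import (
	"fmt"
	"time"
)

// slowSquare stands in for an IO-bound operation (hypothetical).
func slowSquare(x int) int {
	time.Sleep(10 * time.Millisecond)
	return x * x
}

// squareAsync runs slowSquare in its own goroutine and returns a channel
// that will carry the result, loosely comparable to returning a Task<int>.
func squareAsync(x int) <-chan int {
	ch := make(chan int, 1) // buffered so the sender never waits for a receiver
	go func() { ch <- slowSquare(x) }()
	return ch
}

// sumOfSquares starts both computations, then receives both results. Each
// receive parks the calling goroutine until a value is ready, which plays
// the role of "await" without splitting the function into callbacks.
func sumOfSquares(a, b int) int {
	ca, cb := squareAsync(a), squareAsync(b)
	return <-ca + <-cb
}

func main() {
	fmt.Println(sumOfSquares(3, 4)) // 9 + 16 = 25
}
```

Note that sumOfSquares is an ordinary function: nothing in its signature marks it as asynchronous, and the caller decides whether to run it in a new goroutine.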