How do web-servers "listen" to IP addresses, interrupt or polling?
I'm trying to understand the lower-level details of web servers. I'm wondering whether a server, say Apache, continuously polls for new requests or whether it works by some sort of interrupt system. If it's an interrupt, what triggers the interrupt? Is it the network card driver?
Solution 1:
The short answer is: some sort of interrupt system. Essentially, they use blocking I/O, meaning they sleep (block) while waiting for new data.
The server creates a listening socket and then blocks while waiting for new connections. During this time, the kernel puts the process into an interruptible sleep state and runs other processes. This is an important point: having the process poll continuously would waste CPU. The kernel is able to use the system resources more efficiently by blocking the process until there is work for it to do.
1. When new data arrives on the network, the network card issues an interrupt.
2. Seeing that there is an interrupt from the network card, the kernel, via the network card driver, reads the new data from the network card and stores it in memory. (This must be done quickly and is generally handled inside the interrupt handler.)
3. The kernel processes the newly arrived data and associates it with a socket. A process that is blocking on that socket will be marked runnable, meaning that it is now eligible to run. It does not necessarily run immediately (the kernel may still decide to run other processes first).
4. At its leisure, the kernel wakes up the blocked webserver process, since it is now runnable.
5. The webserver process continues executing as if no time has passed. Its blocking system call returns and it processes the new data. Then... go to step 1.
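The blocking pattern from the steps above can be sketched in a few lines of Python, whose socket module wraps the same accept()/recv() system calls. The loopback client thread, port choice, and message here are purely illustrative:

```python
# Sketch of blocking I/O: the process sleeps inside accept() and recv()
# until the kernel (prodded by a NIC interrupt) has something for it.
import socket
import threading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))        # any free loopback port
srv.listen()
port = srv.getsockname()[1]

def client():
    # Illustrative client so the server has something to accept.
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(b"GET / HTTP/1.0\r\n\r\n")

threading.Thread(target=client).start()

# accept() blocks: the kernel marks this process not-runnable until a
# connection is queued, then wakes it and the call returns.
conn, _ = srv.accept()
data = conn.recv(4096)            # blocks again until data arrives
print(data.split()[0])            # b'GET'
conn.close()
srv.close()
```

While the process is parked inside accept() or recv(), it consumes no CPU at all, which is exactly the efficiency win over polling described above.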
Solution 2:
There are quite a lot of "lower" details.
First, consider that the kernel has a list of processes, and at any given time, some of these processes are running, and some are not. The kernel allows each running process some slice of CPU time, then interrupts it and moves to the next. If there are no runnable processes, then the kernel will probably issue an instruction like HLT, which suspends the CPU until there is a hardware interrupt.
Somewhere in the server is a system call that says "give me something to do". There are two broad categories of ways this can be done. In the case of Apache, it calls accept() on a socket Apache has previously opened, probably listening on port 80. The kernel maintains a queue of connection attempts, and adds to that queue every time a TCP SYN is received. How the kernel knows a TCP SYN was received depends on the device driver; for many NICs there's probably a hardware interrupt when network data are received.
accept() asks the kernel to return the next connection initiation. If the queue isn't empty, then accept() just returns immediately. If the queue is empty, then the process (Apache) is removed from the list of running processes. When a connection is later initiated, the process is resumed. This is called "blocking", because to the process calling it, accept() looks like a function that doesn't return until it has a result, which might be some time from now. During that time the process can do nothing else.
Once accept() returns, Apache knows that someone is attempting to initiate a connection. It then calls fork() to split the Apache process into two identical processes. One of these processes goes on to handle the HTTP request; the other calls accept() again to get the next connection. Thus, there's always a master process which does nothing but call accept() and spawn sub-processes, and then there's one sub-process for each request.
This is a simplification: it's possible to do this with threads instead of processes, and it's also possible to fork() beforehand so there's a worker process ready to go when a request is received, thus reducing startup overhead. Depending on how Apache is configured, it may do either of these things.
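The accept-then-fork pattern can be sketched with Python's os.fork(), which wraps the same system call (Unix-only; the port, messages, and client helper are illustrative, and real Apache adds pre-forking, signal handling, and much more):

```python
# Sketch of the accept-then-fork model: a master process blocks in
# accept(), then forks a child to handle each connection.
import os
import socket
import threading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen()
port = srv.getsockname()[1]

replies = []
def client():
    # Illustrative client so the master has a connection to accept.
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(b"ping")
        replies.append(c.recv(16))

t = threading.Thread(target=client)
t.start()

conn, _ = srv.accept()            # master blocks here until a SYN arrives
pid = os.fork()                   # split into two identical processes
if pid == 0:                      # child: handle this one request, then exit
    conn.sendall(b"pong: " + conn.recv(16))
    conn.close()
    os._exit(0)
conn.close()                      # master drops its copy of the connection
os.waitpid(pid, 0)                # (a real master would loop back to accept())
t.join()
print(replies)                    # [b'pong: ping']
srv.close()
```

Note that both processes hold a descriptor for the accepted connection after fork(), which is why the master must close its copy before looping.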
That's the first broad category of how to do it, and it's called blocking I/O because the system calls like accept(), read(), and write() which operate on sockets will suspend the process until they have something to return.
The other broad way to do it is called non-blocking, event-based, or asynchronous I/O. This is implemented with system calls like select() or epoll(). These each do the same thing: you give them a list of sockets (or, in general, file descriptors) and what you want to do with them, and the kernel blocks until it's ready to do one of those things.
With this model, you might tell the kernel (with epoll), "Tell me when there is a new connection on port 80 or new data to read on any of these 9471 other connections I have open". epoll blocks until one of those things is ready; then you do it, and repeat. System calls like accept(), read(), and write() never block, in part because whenever you call them, epoll just told you that they are ready, so there'd be no reason to block, and also because when you open the socket or the file you specify that you want it in non-blocking mode, so those calls will fail with EWOULDBLOCK instead of blocking.
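A minimal sketch of this event loop, using Python's selectors module (which sits on top of epoll on Linux); the loopback client and message are illustrative, and real servers track far more per-connection state:

```python
# Sketch of the event-driven model: one process multiplexing a listening
# socket and all of its connections through a single blocking wait.
import selectors
import socket
import threading

sel = selectors.DefaultSelector()   # epoll on Linux, kqueue on BSD, etc.

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen()
srv.setblocking(False)              # accept() must never block
port = srv.getsockname()[1]
sel.register(srv, selectors.EVENT_READ)

def client():
    # Illustrative client that connects and sends one message.
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(b"hello")

threading.Thread(target=client).start()

seen = []
while len(seen) < 1:
    # "Tell me when something is ready" -- the one place we block.
    for key, _ in sel.select():
        if key.fileobj is srv:
            conn, _ = srv.accept()          # ready, so this won't block
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = key.fileobj.recv(4096)   # ready, so this won't block
            if data:
                seen.append(data)
            sel.unregister(key.fileobj)
            key.fileobj.close()
print(seen)                                  # [b'hello']
sel.close()
srv.close()
```

The same loop scales to thousands of registered connections: the kernel wait is one call regardless of how many descriptors are being watched.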
The advantage of this model is that you need only one process. This means you don't have to allocate a stack and kernel structures for each request. Nginx and HAProxy use this model, and it's a big reason they can deal with so many more connections than Apache on similar hardware.