Explain the HTTP keep-alive mechanism

Keep-alives were added to HTTP to basically reduce the significant overhead of rapidly creating and closing socket connections for each new request. The following is a summary of how it works within HTTP 1.0 and 1.1:

HTTP 1.0

The HTTP 1.0 specification does not really delve into how Keep-Alive should work. Basically, browsers that support Keep-Alive appended an additional header to the request as [edited for clarity] explained below:

When the server processes the request and generates a response, it also adds a header to the response:

Connection: Keep-Alive

When this is done, the socket connection is not closed as before, but kept open after sending the response. When the client sends another request, it reuses the same connection. The connection will continue to be reused until either the client or the server decides that the conversation is over, and one of them drops the connection.

The above explanation comes from here, but there is one thing I don't understand:

When this is done, the socket connection is not closed as before, but kept open after sending the response.

As I understand it, we just send TCP packets to make requests and responses. How does this socket connection help, and how does it work? We still have to send packets, so how can a persistent connection be established? It seems so unreal.

Solution 1:

There is overhead in establishing a new TCP connection (DNS lookups, TCP handshake, SSL/TLS handshake, etc.). Without keep-alive, every HTTP request has to establish a new TCP connection, and then close the connection once the response has been sent/received. Keep-alive allows an existing TCP connection to be re-used for multiple requests/responses, thus avoiding all of that overhead. That is what makes the connection "persistent".

In HTTP 0.9 and 1.0, by default the server closes its end of a TCP connection after sending a response to a client. The client must close its end of the TCP connection after receiving the response. In HTTP 1.0 (but not in 0.9), a client can explicitly ask the server not to close its end of the connection by including a Connection: keep-alive header in the request. If the server agrees, it includes a Connection: keep-alive header in the response, and does not close its end of the connection. The client may then re-use the same TCP connection to send its next request.

In HTTP 1.1, keep-alive is the default behavior, unless the client explicitly asks the server to close the connection by including a Connection: close header in its request, or the server decides to include a Connection: close header in its response.
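
As a rough illustration of the rules above, here is a minimal Python sketch of how one side could decide whether a connection stays open after a message. The function name `connection_persists` and its signature are made up for this example, and real clients and servers also take timeouts, request limits, and proxies into account.

```python
from typing import Optional


def connection_persists(http_version: str, connection_header: Optional[str]) -> bool:
    """Return True if this message allows the TCP connection to stay open.

    http_version is a string such as "HTTP/1.0" or "HTTP/1.1";
    connection_header is the value of the Connection header, or None if absent.
    """
    tokens = set()
    if connection_header:
        # The header may carry several comma-separated, case-insensitive tokens.
        tokens = {t.strip().lower() for t in connection_header.split(",")}

    if "close" in tokens:
        return False  # an explicit close always wins
    if http_version == "HTTP/1.1":
        return True  # persistent by default in HTTP 1.1
    # HTTP 1.0 (and anything older): persistent only when explicitly requested.
    return "keep-alive" in tokens
```

Both sides have to agree, so under this sketch a connection would be re-used only when the function returns True for the request and for the response.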

Solution 2:

Let's make an analogy. HTTP consists of sending a request and getting a response. This is similar to asking someone a question and receiving an answer.

The problem is that the question and the answer need to go through the network. To communicate through the network, TCP (sockets) is used. That's similar to using the phone to ask someone a question and having this person answer.

With HTTP 1.0 (without keep-alive), loading a page containing two images, for example, consists of:

  • make a phone call
  • ask for the page
  • get the page
  • end the phone call
  • make a phone call
  • ask for the first image
  • get the first image
  • end the phone call
  • make a phone call
  • ask for the second image
  • get the second image
  • end the phone call

Making a phone call and ending it takes time and resources. Control data (like the phone number) must transit over the network. It would be more efficient to make a single phone call to get the page and the two images. That's what keep-alive allows. With keep-alive, the above becomes:

  • make a phone call
  • ask for the page
  • get the page
  • ask for the first image
  • get the first image
  • ask for the second image
  • get the second image
  • end the phone call
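
The two call sequences above can be tried out with Python's standard library. This is only a sketch under some assumptions: it starts a throwaway HTTP 1.1 server on the loopback interface, and the names `CountingHandler`, `fetch_with_new_calls`, `fetch_with_one_call`, and `count_connections`, as well as the three paths, are invented for the example. The server counts how many TCP connections (phone calls) it accepts.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

PATHS = ["/page", "/image1", "/image2"]  # hypothetical resources


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables persistent connections
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = f"you asked for {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_with_new_calls(host, port):
    """One phone call per question: a new TCP connection for every request."""
    for path in PATHS:
        conn = http.client.HTTPConnection(host, port)
        conn.request("GET", path)
        conn.getresponse().read()
        conn.close()


def fetch_with_one_call(host, port):
    """One phone call for all questions: every request shares a connection."""
    conn = http.client.HTTPConnection(host, port)
    for path in PATHS:
        conn.request("GET", path)
        conn.getresponse().read()  # finish reading before asking again
    conn.close()


def count_connections(fetch):
    """Run fetch against a fresh local server; return connections accepted."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        fetch(host, port)
        return CountingHandler.connections
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(count_connections(fetch_with_new_calls))
    print(count_connections(fetch_with_one_call))
```

With this setup the first strategy should report three connections and the second only one, which mirrors the twelve-step and eight-step lists above.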

Solution 3:

This is indeed a networking question, but it may be appropriate here after all.

The confusion arises from the distinction between packet-oriented and stream-oriented connections.

The Internet is often called a "TCP/IP" network. At the low level (IP, the Internet Protocol), the Internet is packet-oriented: hosts send packets to other hosts.

However, on top of IP we have TCP (Transmission Control Protocol). The entire purpose of this layer of the Internet is to hide the packet-oriented nature of the underlying medium and to present the connection between two hosts (hosts and ports, to be more correct) as a stream of data, similar to a file or a pipe. We can then open a socket in the OS API to represent that connection, and we can treat that socket as a file descriptor (literally an FD in Unix, very similar to a file HANDLE in Windows).
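
To make the stream idea more concrete, here is a small Python sketch. It uses `socket.socketpair()`, which gives two connected stream sockets inside one process, as a stand-in for a real TCP connection between two hosts (on Unix these are typically local sockets rather than TCP, but the stream behavior is the same). The function name `stream_demo` and the host name `example.test` are made up for the example.

```python
import socket


def stream_demo() -> bytes:
    """Write a request in three pieces and read it back as one byte stream."""
    sender, receiver = socket.socketpair()
    with sender, receiver:
        # Three separate writes on the sending side...
        sender.sendall(b"GET /page ")
        sender.sendall(b"HTTP/1.1\r\n")
        sender.sendall(b"Host: example.test\r\n\r\n")
        sender.shutdown(socket.SHUT_WR)  # no more data from this side

        # ...arrive as an undivided sequence of bytes. How many recv() calls
        # it takes, and where the chunk boundaries fall, is not something
        # the reader can rely on.
        chunks = []
        while True:
            data = receiver.recv(4096)
            if not data:  # empty read means the other side finished writing
                break
            chunks.append(data)
    return b"".join(chunks)
```

The reader only ever sees bytes in order; the packets (or write calls) that carried them are invisible at this level, which is why a connection can simply stay open and carry the next request.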

Most other Internet client-server protocols (HTTP, Telnet, SSH, SMTP) are layered on top of TCP. Thus a client opens a connection (a socket), writes its request (which is transmitted as one or more packets in the underlying IP) to the socket, reads the response from the socket (and the response can contain data from multiple IP packets as well), and then has a choice: keep the connection open for the next request, or close it. Pre-KeepAlive HTTP always closed the connection. New clients and servers can keep it open.

The advantage of KeepAlive is that it avoids repeatedly establishing connections, which is expensive. For short requests and responses, setting up the connection may take more packets than the actual data exchange.

The slight disadvantage may be that the server now has to tell the client where the response ends. The server cannot simply send the response and close the connection. It has to tell the client: "read 20KB and that will be the end of my response". Thus the size of the response has to be known in advance by the server and communicated to the client as part of the higher-level protocol (e.g. Content-Length: in HTTP). Alternatively, the server may send a delimiter to specify the end of the response; it all depends on the protocol above TCP.
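
A minimal Python sketch of the Content-Length approach, assuming every response carries that header (delimiter-based framing and error handling are left out). The helper name `read_response` is made up for the example; it reads exactly one response from a binary stream and leaves the stream positioned at the start of the next one.

```python
import io


def read_response(stream):
    """Read one HTTP response; return (status_line, headers, body)."""
    status_line = stream.readline().rstrip(b"\r\n").decode("ascii")

    headers = {}
    while True:
        line = stream.readline().rstrip(b"\r\n")
        if not line:  # a blank line ends the header section
            break
        name, _, value = line.decode("ascii").partition(":")
        headers[name.strip().lower()] = value.strip()

    # The connection stays open, so "read until EOF" would never finish.
    # Content-Length says exactly how many body bytes belong to this response.
    length = int(headers["content-length"])
    body = stream.read(length)
    return status_line, headers, body


# Two responses arriving back to back on the same connection,
# simulated here with an in-memory stream.
wire = io.BytesIO(
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye"
)
first = read_response(wire)
second = read_response(wire)
```

With a real connection, the same function could be given `sock.makefile("rb")` instead of the in-memory stream. Without the length, the client would have no way to tell where "hello" ends and the second status line begins.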