How does a web server/the HTTP protocol handle version control and compression?

You can read about it in exorbitant detail in the HTTP specification, but here's the gist: when the browser needs to request a file, it first checks its local cache. There are three main possibilities:

  1. The browser has a local (cached) version of the file that is marked as expiring at a certain time, and that time is in the future. In this case, the browser can either use the cached version as is, or send a request to the server to see if the file has changed. If the browser sends the request, it will include an If-Modified-Since header containing the modification time of its cached copy (normally the value of the Last-Modified header the server sent along with the file).
  2. The browser has a cached version of the file that has already expired. In this case, the browser will definitely send a request to the server to see if there's a new version, and that request will (usually) include an If-Modified-Since header containing the modification time of the cached copy.
  3. The browser doesn't have the file cached at all, in which case it sends a request with no If-Modified-Since header.
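
If it helps to see those three cases as code, here's a rough sketch of the decision in Python. The names (CacheEntry, plan_request) and the tuple it returns are made up for illustration; real browsers apply many more rules than this.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CacheEntry:
    """Hypothetical record of one cached file."""
    body: bytes
    last_modified: str  # Last-Modified value the server sent with the file
    expires_at: float   # Unix timestamp after which the copy counts as expired


def plan_request(entry: Optional[CacheEntry], now: float) -> Tuple[str, Dict[str, str]]:
    """Return (action, request headers) for the three cases in the list above."""
    if entry is None:
        # Case 3: nothing cached, so send a plain request.
        return ("fetch", {})
    if now < entry.expires_at:
        # Case 1: still fresh. This sketch always uses the cached copy;
        # a browser could instead choose to revalidate as in case 2.
        return ("use_cache", {})
    # Case 2: expired, so ask the server whether the file has changed.
    return ("revalidate", {"If-Modified-Since": entry.last_modified})
```

For example, an entry whose expires_at is already in the past gives back ("revalidate", {...}) with the stored Last-Modified value echoed in the header.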

When the request gets to the server, one of a few things can happen. If the request does not include an If-Modified-Since header, the server will go ahead and send back the file using an HTTP 200 (OK) response code. (Or it'll send a 404 Not Found, or 403 Forbidden, or whatever is appropriate.) But if the request did include an If-Modified-Since header, the server knows that it only has to send back the file if it was modified since the time contained in the header. If the file was modified since that time, then again, the server will send back the file with a code of 200 (or respond with 403, 404, or whatever applies). But if the file has not been modified since the specified time - which, remember, means that the browser's cached version is still current - the server can respond with a 304 (Not Modified) code, and leave out the contents of the file itself. This saves network traffic, since only the response headers have to be sent.
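
Here's a minimal sketch of that server-side decision in Python, just to make the logic concrete. The function name choose_status and its parameters are my own invention, and it only covers the 200/304/404 cases; a real server checks permissions and other conditional headers too.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def choose_status(file_mtime: Optional[float], if_modified_since: Optional[str]) -> int:
    """Pick a status code from the file's mtime (None if the file is missing)
    and the If-Modified-Since header value (None if the header is absent)."""
    if file_mtime is None:
        return 404  # no such file
    if if_modified_since is None:
        return 200  # unconditional request: send the file
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: treat the request as unconditional
    if since.tzinfo is None:
        # Dates with a "-0000" offset parse as naive; interpret them as UTC.
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop the fractional part.
    modified = datetime.fromtimestamp(int(file_mtime), tz=timezone.utc)
    return 200 if modified > since else 304
```

With a 304 the server would send only headers; with a 200 it would follow them with the file contents.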

Now, assuming the server is going to respond with the full content of the file, there are a few ways to go about it, depending on how the server is written and/or configured. Obviously, it could just read the file from disk (or run the program to generate it, if it's a dynamic page) every time a request comes in and send it back, but as you know, that's kind of inefficient. One thing the server can do is send back a gzipped version of the file, if the browser specifies Accept-Encoding: gzip in its request. It does indeed make sense for the server to keep a cached version of the gzipped file, and Apache (and probably most other servers) can be configured to do so. In that kind of setup, when the server is preparing to send back a gzipped response, it checks the modification time of the cached gzipped version against the modification time of the original file, and if the original file has been updated, it runs gzip on it again and replaces the old version in the cache with the new one.
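
To illustrate the idea (this is not how Apache actually implements it), here's a small Python sketch of a gzip cache keyed on modification time. The helper names accepts_gzip and get_gzipped, and the choice to keep the compressed copy in a ".gz" file next to the original, are assumptions of mine.

```python
import gzip
import os


def accepts_gzip(accept_encoding: str) -> bool:
    """Simplified check of an Accept-Encoding value; ignores q-values."""
    return any(
        token.split(";")[0].strip().lower() == "gzip"
        for token in accept_encoding.split(",")
    )


def get_gzipped(path: str) -> bytes:
    """Return the gzipped contents of path, recompressing only when the
    cached copy is missing or older than the original file."""
    gz_path = path + ".gz"  # assumed cache location, for illustration only
    source_mtime = os.path.getmtime(path)
    if not os.path.exists(gz_path) or os.path.getmtime(gz_path) < source_mtime:
        # Cache miss or stale cache: run gzip again and replace the old copy.
        with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
            dst.write(src.read())
    with open(gz_path, "rb") as cached:
        return cached.read()
```

A server would call get_gzipped only when accepts_gzip says the browser asked for it, and would add a Content-Encoding: gzip header to the response. Comparing timestamps with "<" can miss an edit made within the same clock tick as the compression, which is one reason real servers tend to be more careful than this.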

Servers can also cache frequently requested files in RAM. I think Apache can be configured to do so, but I'm not sure. (As you can probably guess by now, with Apache it's all about the configuration.)

With regard to your question on how files are requested, the browser does actually ask for files one at a time. Each HTML page, CSS file, JavaScript file, image file, etc. corresponds to one individual HTTP request. A tool like Wireshark can show you the individual HTTP requests and responses going to and from your computer, if you're interested. But to save resources, the TCP/IP connection normally stays open through a whole set of requests. So for example, if you have a web page with 3 images and a CSS stylesheet, you'd probably get a sequence like this:

  • Browser opens connection
  • Server acknowledges connection
  • Browser requests HTML page
  • Server sends HTML page
  • Browser requests CSS stylesheet
  • Server sends CSS stylesheet
  • Browser requests image 1
  • Server sends image 1
  • Browser requests image 2
  • Server sends image 2
  • Browser requests image 3 with Connection: close header
  • Server sends image 3
  • Server closes connection

The Connection: close header can be sent by either side to specify that the TCP/IP connection should be shut down after that request has completed.
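
If you'd like to try the sequence above yourself, here's a rough sketch using Python's standard http.client module. The function name fetch_all is hypothetical, and the host, port, and paths are whatever you pass in; it reuses one connection for every request and adds Connection: close to the last one.

```python
import http.client
from typing import List, Tuple


def fetch_all(host: str, port: int, paths: List[str]) -> List[Tuple[int, bytes]]:
    """GET each path in order over a single connection and return
    (status code, body) pairs."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    results = []
    try:
        for index, path in enumerate(paths):
            is_last = index == len(paths) - 1
            # Ask the server to shut the connection down after the final request.
            headers = {"Connection": "close"} if is_last else {}
            conn.request("GET", path, headers=headers)
            response = conn.getresponse()
            # The body must be read completely before the next request is sent.
            results.append((response.status, response.read()))
    finally:
        conn.close()
    return results
```

Running something like this against a server that supports persistent connections while watching in Wireshark should show a single TCP connection carrying all of the requests.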

Hopefully that gets at mostly what you were asking, but the HTTP specification is a HUGE document and there are a lot of subtleties I've neglected. I actually find it moderately interesting reading so I'd suggest you go take a look at it (then again, I'm probably kind of weird).