How do sites such as Google achieve high availability? [closed]
As I understand it, when I open a website such as Google, the hostname is looked up and my browser uses the resulting IP address to connect to the server and retrieve the page.
However, how do high-availability websites make sure that this single IP address can always be reached? Isn't that a single point of failure?
Solution 1:
There are two common solutions to high availability for web sites: DNS round robin and IP load balancing.
DNS round robin means the DNS server returns several IP addresses for the site's name, typically in a different order on each query; this helps distribute requests across multiple servers, and it also avoids the single point of failure you pointed out. This is the DNS answer for www.google.com (from one of the authoritative name servers for the "google.com" domain):
> www.google.com
Server: ns1.google.com
Address: 216.239.32.10
www.google.com canonical name = www.l.google.com
www.l.google.com internet address = 74.125.77.99
www.l.google.com internet address = 74.125.77.104
www.l.google.com internet address = 74.125.77.147
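To make the rotation concrete, here is a minimal Python sketch of the idea. It is a toy model, not how Google's name servers are implemented: the class `RoundRobinRecordSet` and the helper `first_reachable` are hypothetical names made up for illustration, and the reachability check is passed in as a plain function so that no real network access is needed. The addresses are the ones from the answer above.

```python
from collections import deque


class RoundRobinRecordSet:
    """Toy model of a name server that rotates its A records on every query."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def query(self):
        # Return the full record set, then rotate so that the next
        # query starts with the next address in the list.
        answer = list(self._records)
        self._records.rotate(-1)
        return answer


def first_reachable(answer, is_reachable):
    """Return the first address in the answer that the client can reach.

    is_reachable is a caller-supplied function (an assumption of this
    sketch); a real client would attempt a TCP connection instead.
    """
    for address in answer:
        if is_reachable(address):
            return address
    return None


if __name__ == "__main__":
    zone = RoundRobinRecordSet(
        ["74.125.77.99", "74.125.77.104", "74.125.77.147"]
    )
    for _ in range(3):
        print(zone.query())
```

Because every answer carries all the addresses, a client that cannot reach the first one can fall back to the next, which is what removes the single point of failure on the client side.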
Another common solution, which can be combined with DNS round robin (and very likely is in this case), is IP load balancing: those IP addresses aren't actually assigned to servers but to load-balancing devices (or reverse proxies, or a similar solution), which forward the requests to one of several back-end servers. Should one of those servers fail, another one is used.
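As a rough illustration of that forwarding logic, the following Python sketch picks back ends in round-robin order and skips any that have been marked as failed. The `LoadBalancer` class, its method names, and the 10.0.0.x addresses are all hypothetical; real load balancers also run their own health checks and handle the actual network traffic, which this sketch leaves out.

```python
class LoadBalancer:
    """Toy round-robin load balancer that skips back ends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the back end to try next

    def mark_down(self, backend):
        # In practice a failed health check would trigger this.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each back end at most once, starting where we left off.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy back ends available")


if __name__ == "__main__":
    lb = LoadBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    print([lb.pick() for _ in range(4)])
    lb.mark_down("10.0.0.2")  # simulate a server failure
    print([lb.pick() for _ in range(4)])
```

Clients only ever see the load balancer's public address, so a back end can fail or be taken out for maintenance without any change on the client side.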
More info here:
http://en.wikipedia.org/wiki/Round_robin_DNS
http://en.wikipedia.org/wiki/Load_balancing_(computing)
Solution 2:
An IP address isn't necessarily a SPOF, as it can be reassigned dynamically (a.k.a. failover) to a healthy server should the one currently holding it fail.
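The failover idea can be sketched in a few lines of Python. This is only a model of the decision logic, under the assumption that some external health check reports failures; `FailoverGroup`, its methods, and the node names and address are hypothetical. In real deployments the reassignment is typically handled by a protocol such as VRRP or a tool such as keepalived, not by application code like this.

```python
class FailoverGroup:
    """Toy model of a shared (virtual) IP address that moves between nodes."""

    def __init__(self, virtual_ip, nodes):
        self.virtual_ip = virtual_ip
        self.nodes = list(nodes)  # listed in priority order
        self.alive = {node: True for node in self.nodes}
        self.holder = self.nodes[0]  # node currently answering on the IP

    def report_failure(self, node):
        # Called by a (hypothetical) health check when a node stops responding.
        self.alive[node] = False
        if node == self.holder:
            self._reassign()

    def _reassign(self):
        # Hand the address to the highest-priority node that is still alive.
        for node in self.nodes:
            if self.alive[node]:
                self.holder = node
                return
        self.holder = None  # nobody left to hold the address


if __name__ == "__main__":
    group = FailoverGroup("192.0.2.10", ["lb-a", "lb-b"])
    print(group.holder)
    group.report_failure("lb-a")
    print(group.holder)
```

From the client's point of view the IP address never changes; only the machine answering on it does.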
Solution 3:
Google most likely uses three approaches at the same time:
- At the back end there are a number of servers that serve requests. Each has its own IP address.
- In front of them are hardware load balancers that distribute requests to the servers behind them. Each has one public IP address but may cover 30, 60 or even more physical servers. The load balancers themselves are likely redundant units from a large manufacturer.
- In front of those, DNS round robin is likely used, which distributes load across even more load balancers.
All of this is nicely described here:
http://en.wikipedia.org/wiki/Google_platform
Note that we are talking about hundreds of thousands of servers, spread across many data centers.
Google is very special in that the servers are pretty much read-only. They get a copy of the index and serve it until they are reimaged with a new, updated copy. No updates are ever made to an answering cluster. This is unusual for an application, not because Google is especially smart, but because its requirements are unusual.