When you exhaust a 1 Gbps connection to your main HAProxy, how do you scale out?
You add another port.
Say your existing network looks like one public IP address in front of one HAProxy box in front of N backend servers. You run through 1 Gbps of throughput (or, better yet, you are approaching that limit), but your backend servers are still healthy with spare resources.
The next step is to get a second public IP address and another HAProxy machine in front of your cluster. Add some Global Server Load Balancing (GSLB) out in front to send traffic in some configurable way to each of your two front-end HAProxy boxes.
We've done this by managing our own Ketama-hash-based PowerDNS server. There are also DNS services that provide programmable DNS responses based on geographic location or other criteria.
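To illustrate the idea behind Ketama-style hashing, here is a minimal sketch of a consistent hash ring that maps a client key to one of several front-end addresses. It is not the setup described above, and it is not the exact Ketama algorithm. The class name `HashRing`, the `points_per_node` parameter, and the `203.0.113.x` addresses (a documentation-only range) are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Simplified Ketama-style consistent hash ring (illustrative only)."""

    def __init__(self, nodes, points_per_node=160):
        # Each node is placed at many pseudo-random points on the ring
        # so that load spreads roughly evenly between nodes.
        self._ring = []
        for node in nodes:
            for i in range(points_per_node):
                self._ring.append((self._hash(f"{node}-{i}"), node))
        self._ring.sort()
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        # Use the first 4 bytes of the MD5 digest as a 32-bit ring position.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "little")

    def lookup(self, key):
        # Walk clockwise to the first node point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._points, self._hash(key))
        if idx == len(self._points):
            idx = 0
        return self._ring[idx][1]


if __name__ == "__main__":
    # Hypothetical public addresses of the two HAProxy front ends.
    ring = HashRing(["203.0.113.10", "203.0.113.20"])
    print(ring.lookup("client-198.51.100.7"))
```

The useful property is that adding a third front end only moves the keys that land on the new node. Every other client keeps resolving to the same HAProxy box as before.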
Assuming that you don't have the means to get a faster uplink or otherwise scale the existing HAProxy device, scale out to multiple HAProxy devices.
You can split the load between them in a few different ways:
- DNS round robin. This involves just adding extra A records to the existing DNS name, and should hopefully split the request load semi-evenly around the members of the A record.
- Selective DNS answering. Responds to different DNS requests with different answers depending on criteria - it could simply enforce round-robin distribution, or, if your application can be scaled to new locations effectively, it can answer queries with the closest available instance of the application for a given client (geographically-aware DNS).
- BGP Anycast. This is another method of pushing traffic to a deployment of the application that's close to the user from an internet topology perspective. It is generally considered "not such a good idea" for session-aware communications, as topology changes can break TCP sessions.
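The two DNS-based options in the list can be sketched as a small simulation. This is not a real DNS server: `RoundRobinRecordSet`, `selective_answer`, the region names, and the `203.0.113.x` addresses are all hypothetical. It assumes clients connect to the first address in each answer, which many clients do, but not all.

```python
from collections import Counter, deque


class RoundRobinRecordSet:
    """Rotates a set of A records so successive answers start differently."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        # Return the full record set, then rotate so the next query
        # sees a different address in first position.
        response = list(self._addresses)
        self._addresses.rotate(-1)
        return response


def selective_answer(client_region, regional_addresses, default):
    """Pick the address for the client's region, or a default if unknown."""
    return regional_addresses.get(client_region, default)


if __name__ == "__main__":
    # Round robin: count which front end each simulated client would hit.
    rrset = RoundRobinRecordSet(["203.0.113.10", "203.0.113.20"])
    hits = Counter(rrset.answer()[0] for _ in range(1000))
    print(hits)

    # Selective answering: hypothetical region-to-address mapping.
    regions = {"eu": "203.0.113.10", "us": "203.0.113.20"}
    print(selective_answer("eu", regions, default="203.0.113.10"))
```

In practice, resolver caching and TTLs make the real split less even than this simulation suggests. That is why the round-robin option is described above as only "semi-even".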