Load balancing and HTTPS strategies
I am faced with the following problem: our servers get saturated because the current load-balancing strategy is based on client IP. Some corporate clients access our servers from behind large proxies, so all of their users appear to our load balancer with the same IP. I think we are using a hardware load-balancing device (I can investigate further if necessary). We need to maintain session affinity (the site is built in ASP), so all requests from the same IP get routed to the same node.
Since all communication goes over HTTPS, no request data (such as the session ID) is available to the balancer as a client discriminator. Is there a way to use some other data besides the IP to distinguish between clients, so that clients coming from the same IP can be routed to different nodes?
Note: I need to keep the traffic between the balancer and the nodes secure (encrypted).
Solution 1:
The easiest way to do this, if you already have a load balancer in place, is to decrypt the data on the load balancer and look at a cookie. At that point you can either send the request to the backend server unencrypted, or re-encrypt it and send it on.
Most setups I know of consider the network connection between the load balancer and the backend servers secure and don't bother to re-encrypt the traffic, for a couple of reasons. One is that hardware-based load balancers also act as SSL accelerators, so it is natural for the HTTPS traffic to end at their door. Another is that terminating there allows the traffic to be inspected for attacks.
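To make the "look at a cookie" step concrete, here is a minimal Python sketch of what the balancer could do once the request has been decrypted. The node names, the cookie name LB_NODE, and the function choose_backend are made up for illustration; a real hardware or software balancer exposes this as configuration rather than code, so treat it as a sketch of the idea only.

```python
import itertools
from http.cookies import SimpleCookie

BACKENDS = ["node-a", "node-b", "node-c"]   # hypothetical node names
AFFINITY_COOKIE = "LB_NODE"                 # hypothetical cookie name

# New clients (no valid affinity cookie yet) are spread round-robin.
_round_robin = itertools.cycle(BACKENDS)


def choose_backend(cookie_header):
    """Pick a backend for one decrypted request.

    Returns (backend, set_cookie). set_cookie is None when the client
    already carries a valid affinity cookie, otherwise it is the value of
    a Set-Cookie header the balancer would add to the response.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    morsel = cookies.get(AFFINITY_COOKIE)
    if morsel is not None and morsel.value in BACKENDS:
        return morsel.value, None
    backend = next(_round_robin)
    return backend, f"{AFFINITY_COOKIE}={backend}; Path=/; Secure; HttpOnly"
```

Because the decision depends only on the cookie, two users behind the same corporate proxy can land on different nodes, and each one stays on the node named in their own cookie.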
Solution 2:
There are three common ways of doing this:
First, you can change your load balancer's forwarding logic (keep track of the number of connections to each host and try to distribute load evenly, do a simple round-robin, etc.). Either of the options I mention eliminates the deterministic nature of your current setup (clients from IP X no longer always go to server Y), which also eliminates (or reduces) your problem.
Note that you want to implement "sticky sessions" or their equivalent so that once a client is randomly assigned to a back-end server they keep going to the same one as long as their connection is active.
Second you can decrypt the information, read some server identifier from it and then pass it off (either re-encrypting it or passing it in the clear over your back-end network). Note that this isn't really practical at large scale unless your load-balancing hardware is SSL-accelerated (e.g. a Cisco content switch with SSL modules) since the device you're funneling all the traffic through has to do ALL the SSL work.
Per note in the original question, #2 is probably not an option since the traffic needs to be kept encrypted end-to-end (sounds like decrypting on the load balancer would be a policy violation?)
The third method I don't recommend: setting up split-horizon or round-robin DNS for your target server (either pointing directly to a back-end server or pointing to separate IPs on the load balancer which are statically tied to a back-end, have different balancing pools, etc.). This is pretty common in smaller operations as "ghetto load balancing", but in your situation (where you already have load-balancing equipment) it adds needless complexity compared to the other solutions.
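As a rough illustration of the first option above (least-connections forwarding plus sticky sessions), here is a small Python sketch. The class name LeastConnectionsBalancer and the notion of an opaque client_key are assumptions made for the example. What the key actually is depends on your device: without decrypting it might be the TCP connection's source address and port, and with decryption it could be a cookie.

```python
class LeastConnectionsBalancer:
    """Toy least-connections balancer with a sticky table (illustrative only)."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}  # open connections per backend
        self.sticky = {}                        # client key -> assigned backend

    def acquire(self, client_key):
        """Return the backend for a new connection from client_key."""
        backend = self.sticky.get(client_key)
        if backend is None:
            # First contact: pick the backend with the fewest open connections.
            backend = min(self.active, key=self.active.get)
            self.sticky[client_key] = backend
        self.active[backend] += 1
        return backend

    def release(self, client_key):
        """Record that one connection from client_key has closed."""
        self.active[self.sticky[client_key]] -= 1
```

A real balancer would also expire sticky entries after an idle timeout and skip nodes that fail health checks; both are left out here to keep the sketch short.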
Solution 3:
The only way to do what you want is to terminate SSL on or before your load balancer and then load-balance based on session ID. Open-source solutions for doing both steps in one piece of software would be nginx, haproxy, varnish, and many others.
Some hardware load balancers used to balance based upon SSL session ID, but browsers now re-negotiate SSL sessions, so this no longer works reliably.
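For the "load-balance based on session ID" step, one possible approach once SSL has been terminated is to hash the application's session ID onto a node. The Python sketch below uses rendezvous (highest-random-weight) hashing, which is my choice for the example and not something any of the products above is claimed to use. The function name backend_for_session and the node names are hypothetical.

```python
import hashlib


def backend_for_session(session_id, backends):
    """Map a session ID to a backend using rendezvous hashing.

    Each backend gets a score derived from (backend, session_id), and the
    highest score wins. The same session ID always maps to the same backend
    as long as that backend stays in the list.
    """
    def score(backend):
        digest = hashlib.sha256(f"{backend}|{session_id}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(backends, key=score)
```

Unlike a plain "hash modulo number of nodes", removing one node only moves the sessions that were on that node. Everyone else keeps their assignment, which matters when session state lives in the node's memory.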