What exactly is the process of warming up a load balancer? How does it help before a large spike in load? How is it different from simply scaling up the cluster behind the LB beforehand?

Do CSPs do something special when we make a pre-warmup request?

GCP and AWS use the term warmup differently.


I'm compiling an answer based on the comments on the post, thanks to @Tim and @Michael Hampton.

From Petrutandrei's blog:

The ELB is designed to handle large loads of traffic (20kb/sec) without a problem when this traffic increases gradually over a long period of time (several hours). However, when you expect a high increase in traffic over a short period of time, then you face a problem.

AWS considers that if the traffic increases more than 50% in less than 5 minutes then it means that the traffic is sent to the load balancer at a rate that increases faster than the ELB can scale up to meet it. In such cases, one needs to contact AWS to do an operation called “pre-warming”.
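To make the rule of thumb quoted above concrete, here is a minimal sketch in Python. The function name and parameters are hypothetical, and the 50% / 5 minute thresholds are simply taken from the quote, not from any official AWS API.

```python
def exceeds_scaling_rule(before_rps, after_rps, minutes,
                         max_growth=0.5, window_minutes=5):
    """Return True if traffic grows by more than `max_growth`
    (as a fraction) in fewer than `window_minutes` minutes.

    Thresholds default to the figures in the quoted blog post;
    they are assumptions, not documented AWS limits.
    """
    if before_rps <= 0:
        raise ValueError("before_rps must be positive")
    growth = (after_rps - before_rps) / before_rps
    return growth > max_growth and minutes < window_minutes


# A jump from 1000 to 1600 requests/sec in 3 minutes is a 60% increase,
# so by this rule it is a candidate for pre-warming.
print(exceeds_scaling_rule(1000, 1600, 3))
```

If this returns True for your expected traffic pattern, that is roughly the situation where the quote suggests contacting AWS.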

So scaling up a cluster before the actual load spike arrives is called warming up. However, the load balancer itself is a virtual machine or a service hosted somewhere that runs the load-balancing logic. The resources doing the load balancing also need to scale up rapidly.

Increasing only the cluster size adds targets for the load, but not the resources that do the actual load balancing. That can only be done from the CSP's side, or by generating equivalent artificial load from your own end. This is called pre-warming.
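If you go the artificial-load route, the idea is to ramp traffic up gradually so the load balancer can keep pace. Below is a rough sketch that only computes a ramp schedule; it does not send any requests. The function name, the 50% growth cap and the 5 minute step are assumptions based on the quote above, and you would feed the resulting rates into whatever load-testing tool you already use.

```python
def ramp_schedule(start_rps, target_rps, max_growth=0.5, step_minutes=5):
    """Return a list of (minute_offset, requests_per_second) steps.

    Each step raises the rate by at most `max_growth` (as a fraction)
    over the previous step, until `target_rps` is reached.
    """
    if start_rps <= 0:
        raise ValueError("start_rps must be positive")
    if max_growth <= 0:
        raise ValueError("max_growth must be positive")
    if target_rps < start_rps:
        raise ValueError("target_rps must be >= start_rps")

    schedule = [(0, start_rps)]
    rate = start_rps
    minute = 0
    while rate < target_rps:
        # Grow by the allowed fraction, but never overshoot the target.
        rate = min(rate * (1 + max_growth), target_rps)
        minute += step_minutes
        schedule.append((minute, rate))
    return schedule


# Ramp from 100 to 400 requests/sec in 5 minute steps.
for minute, rps in ramp_schedule(100, 400):
    print(f"t+{minute:>2} min: {rps} req/s")
```

With these defaults, going from 100 to 400 requests/sec takes four steps, i.e. 20 minutes. A lower `max_growth` gives a more conservative ramp at the cost of a longer warm-up.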

More information on the AWS website: [https://aws.amazon.com/articles/best-practices-in-evaluating-elastic-load-balancing/#pre-warming][1]