Can Elastic Load Balancers correctly distribute traffic to differently sized instances?

As I understand it, they just do round robin, evenly distributing connections to the servers behind them.

Kind of, but not quite, I think. Unfortunately, the Amazon ELB routing documentation is just short of nonexistent, so one needs to assemble some pieces to draw a conclusion. Here is the only fragment from the Elastic Load Balancing Developer Guide I'm aware of; see the section Sticky Sessions in Overview of Elastic Load Balancing:

By default, a load balancer routes each request independently to the application instance with the smallest load. However, you can use the sticky session feature (also known as session affinity), which enables the load balancer to bind a user's session to a specific application instance. This ensures that all requests coming from the user during the session will be sent to the same application instance. [emphasis mine]

Now what does smallest load mean exactly? Again, the only explanation I'm aware of is the somewhat vague AWS team response from 2009 to ELB Strategy:

ELB loosely keeps track of how many requests (or connections in the case of TCP) are outstanding at each instance. It does not monitor resource usage (such as CPU or memory) at each instance. ELB currently will round-robin amongst those instances that it believes has the fewest outstanding requests. [emphasis mine]

This makes a lot of sense given their system architecture and the use cases they address, but it obviously doesn't provide the transparency and/or control over routing you may want or need for advanced HA scenarios.
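To make the quoted "round-robin amongst those instances that it believes has the fewest outstanding requests" a bit more concrete, here is a minimal Python sketch of one way such a policy could look. This is purely my own toy model, not ELB's actual implementation; the class name `LeastOutstandingRouter` and the backend names are made up for illustration.

```python
class LeastOutstandingRouter:
    """Toy model (an assumption, not ELB code): round-robin among the
    backends that currently have the fewest outstanding requests."""

    def __init__(self, backends):
        self._order = list(backends)
        self.outstanding = {name: 0 for name in self._order}
        self._cursor = 0  # rotates so that tied backends take turns

    def pick(self):
        fewest = min(self.outstanding.values())
        n = len(self._order)
        # Scan from the cursor; the first backend tied for "fewest" wins.
        for step in range(n):
            idx = (self._cursor + step) % n
            name = self._order[idx]
            if self.outstanding[name] == fewest:
                self._cursor = (idx + 1) % n
                self.outstanding[name] += 1
                return name

    def finish(self, name):
        # Called when a backend has answered one of its requests.
        self.outstanding[name] -= 1


router = LeastOutstandingRouter(["small", "large"])
first, second = router.pick(), router.pick()  # tie: plain round-robin
router.finish("large")                        # the large instance answers first
third = router.pick()                         # so it receives the next request
print(first, second, third)                   # small large large
```

Note that nothing here looks at CPU or memory, which matches the quote: an instance only attracts more traffic indirectly, by finishing its requests sooner.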

Please note that, depending on interpretation, this may or may not be contradicted a bit by a more recent AWS team response to Elastic Load Balancing - Load distribution policies:

Round-robin does come into play but the client sessions do not always honour TTL's or DNS caches so you can get skewed results and uneven distributions of requests. The ELB does not take into effect what traffic/requests instances have received to-date in there traffic routing decisions. [emphasis mine]

Health Checks

Of course, the above is complemented by the properly documented, transparent and controllable health checks, which give you some leverage to (potentially temporarily) exclude instances from routing in the first place, as summarized in the aforementioned AWS team response to ELB Strategy as well:

The load balancer monitors the health of your instances registered with your load balancer. When the load balancer detects a problem with an instance, it stops distributing traffic to it. When the instance is healthy again, the load balancer restarts distributing traffic to it. This process allows your application to automatically react to failed instances without your having to be involved beyond configuring the healthcheck.
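As a rough illustration of the "stops distributing traffic" / "restarts distributing traffic" behaviour described in that quote, here is a small Python sketch of a threshold-based health tracker. It is only a model under my own assumptions: the class `HealthTracker`, the parameter names, and the threshold values (2 failures to take an instance out, 3 successes to bring it back) are invented for the example and are not taken from the ELB documentation.

```python
class HealthTracker:
    """Toy model (an assumption, not ELB code) of a health check with
    consecutive-result thresholds for a single instance."""

    def __init__(self, unhealthy_threshold=2, healthy_threshold=3):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.in_service = True
        self._streak = 0  # consecutive results contradicting the current state

    def record(self, check_passed):
        """Record one health check result; return whether the instance is routable."""
        if check_passed == self.in_service:
            self._streak = 0
            return self.in_service
        self._streak += 1
        needed = self.healthy_threshold if check_passed else self.unhealthy_threshold
        if self._streak >= needed:
            self.in_service = check_passed
            self._streak = 0
        return self.in_service


tracker = HealthTracker()
states = [tracker.record(ok) for ok in [True, False, False, True, True, True]]
print(states)  # [True, True, False, False, False, True]
```

A router would then simply skip every instance whose tracker currently reports `False`.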

Conclusion

While certainly unusual, I don't see why ELB shouldn't work with a pool of different Amazon EC2 instance types as well. I haven't tried this myself, though, and would recommend both Monitoring Your Load Balancer Using CloudWatch and monitoring your individual EC2 instances, then correlating the results in order to eventually gain insight into and confidence in such a setup.


Based on the statements made up until now, the distribution algorithm is extremely simple.

The front-end of ELB is typically more-than-one ELB instance, and the distribution is round-robin.

The back-end (your instances) algorithm is claimed to be:

ELB loosely keeps track of how many requests (or connections in the case of TCP) are outstanding at each instance. It does not monitor resource usage (such as CPU or memory) at each instance. ELB currently will round-robin amongst those instances that it believes has the fewest outstanding requests.

This would imply that if a larger instance has fewer outstanding requests, then more traffic would be routed to it. There's no way to guarantee this, though.
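To show what that implication could look like, here is a small, deterministic Python simulation. It is only a sketch under assumptions I made up: one request arrives per tick, the "small" instance needs 4 ticks per request, the "large" one needs 1 tick, and ties are broken round-robin. The function `simulate` and all numbers are hypothetical and say nothing about how a real ELB behaves.

```python
def simulate(service_ticks, total_requests):
    """Route one request per tick to the instance with the fewest
    in-flight requests (toy model, not ELB's real algorithm)."""
    names = list(service_ticks)
    in_flight = {name: [] for name in names}  # completion ticks per instance
    routed = {name: 0 for name in names}
    cursor = 0
    for tick in range(total_requests):
        # Retire requests that have completed by this tick.
        for name in names:
            in_flight[name] = [done for done in in_flight[name] if done > tick]
        fewest = min(len(queue) for queue in in_flight.values())
        # Round-robin among the instances tied for fewest outstanding requests.
        for step in range(len(names)):
            idx = (cursor + step) % len(names)
            name = names[idx]
            if len(in_flight[name]) == fewest:
                cursor = (idx + 1) % len(names)
                in_flight[name].append(tick + service_ticks[name])
                routed[name] += 1
                break
    return routed


routed = simulate({"small": 4, "large": 1}, total_requests=1000)
print(routed)  # {'small': 250, 'large': 750}
```

In this toy setup the faster instance ends up with three quarters of the requests, but only because it really does finish each request sooner. With equal service times the split goes back to 50/50, which is why the skew towards the larger instance can't be guaranteed.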