How to load balance with respect to memory, disk usage, and other attributes

I've found load balancers such as NGINX, but those seem to balance only with CPU usage and network traffic in mind. How would I load balance with respect to other variables, such as the amount of disk or memory available at each node?

Would I have to write my own request handling service in order to take advantage of those variables when deciding which node to send the request to?

Here is my use case: I am building a distributed filesystem for erasure codes and would like the load balancer to send each file write to a node that can handle the traffic and CPU load, has sufficient disk for the I/O operation, and has sufficient memory for the expected operation. As I understand it, a load balancer can only tackle the traffic and CPU portions, so how would I add these further requirements to the load balancer?

Thank you for helping with my first question.


Rob-d mentions that load-balancers must perform health checks on the backend servers to ensure they're healthy and can serve requests. This is absolutely true and I think it's what would enable you to do what you want (checking other metrics and having the LB make routing choices based on those).

Assuming you're load-balancing HTTP, most load-balancers will perform an HTTP GET or HEAD against a certain page on each server to check its status as a viable backend. This page could be a static image, a CSS file, or even an HTML page.

But it could also be a PHP/ASP/Java/Python page. Some might argue that it should even be a page that can perform some sort of sanity check on your application's stack (SQL, NoSQL, helper services, etc.).

There's no reason you couldn't write a script that implements your complex load-balancing algorithm and simply returns an HTTP/1.1 200 OK or an HTTP/1.1 503 Service Unavailable, depending on whether or not the server is able to serve requests.
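For instance, here is a minimal sketch of such a check in Python. It assumes Linux (it reads /proc/meminfo), and the thresholds, port, and /data mount point are illustrative, not prescriptive:

```python
# Minimal health-check endpoint (sketch). Assumes Linux for /proc/meminfo;
# the thresholds, port, and the /data mount point are illustrative only.
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer

MIN_FREE_DISK = 10 * 1024**3  # example: require 10 GiB free on the data disk
MIN_FREE_MEM = 512 * 1024**2  # example: require 512 MiB of available RAM

def mem_available_bytes():
    # Parse MemAvailable (reported in kB) out of /proc/meminfo (Linux-specific).
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024
    return 0

class HealthCheck(BaseHTTPRequestHandler):
    def do_GET(self):
        disk_ok = shutil.disk_usage("/data").free >= MIN_FREE_DISK
        mem_ok = mem_available_bytes() >= MIN_FREE_MEM
        # 200 keeps this node in rotation; 503 takes it out.
        self.send_response(200 if disk_ok and mem_ok else 503)
        self.end_headers()

    do_HEAD = do_GET  # some load balancers probe with HEAD instead of GET

if __name__ == "__main__":
    HTTPServer(("", 8080), HealthCheck).serve_forever()
```

Point the load balancer's HTTP health check at that port, and the node drops out of rotation whenever any of its checks fail.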

I know of at least one load-balancer that can perform a secondary agent-check, which is able to return more details than simply UP/DOWN, allowing a server's weight to be dynamically changed at configured intervals based on anything the server's agent decides. I think this would be exactly what you're looking for.
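HAProxy's agent-check works this way, for example: the load balancer periodically opens a TCP connection to an agent running on the backend, and the agent replies with a one-line ASCII status such as "up", "down", or a weight percentage like "75%". Below is a minimal sketch of such an agent in Python; the port and the disk-based weight policy are my own illustrative assumptions:

```python
# Agent-check responder (sketch). Assumes a HAProxy-style agent protocol:
# the load balancer opens a TCP connection and the agent answers with one
# ASCII line such as "up", "down", or a weight percentage like "75%".
# The port and the disk-based weight policy here are illustrative.
import shutil
import socketserver

def current_weight():
    # Example policy: full weight with >= 50 GiB free on /data,
    # scaling down linearly as the disk fills up.
    free = shutil.disk_usage("/data").free
    pct = min(100, int(free * 100 / (50 * 1024**3)))
    return "down\n" if pct == 0 else f"{pct}%\n"

class Agent(socketserver.StreamRequestHandler):
    def handle(self):
        self.wfile.write(current_weight().encode("ascii"))

if __name__ == "__main__":
    socketserver.TCPServer(("", 9999), Agent).serve_forever()
```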


The question you're asking is an extremely important one in relation to load balancing. There are two primary reasons why we load balance: first, and most obvious, to split client requests across two or more servers; second, to make that service highly available. Having configured the load balancer with these two outcomes in mind, we then enter the realm of "what if?". Let's say you load balance two web servers and Apache crashes on host one: using just a load-balancing algorithm like round robin, the load balancer would still send client requests to the crashed server. So we also need to monitor the servers with 'health checks', the primary reason being to take evasive action when a health check fails. You can imagine the health checks the load balancer needs to perform in our Apache example: is Apache up? Can you ping your gateway? Is the disk full? Can you reach your DB server? And so on.

There are many other advantages to load balancing, like caching, sticky sessions, SSL offloading, and routing based on the client's IP, geolocation, or browser. You can use HTTP rewrites and redirects, modify headers, and act on basically anything else you care to mention (the server's temperature, for example).

As for the metrics you mention, these are not health checks but 'performance checks' or 'threshold states'. The load balancer can of course poll the server for any metric you like, and it will route requests based on the parameters you have defined. But load balancers are primarily network devices; they don't poll RAM and CPU themselves. Something else (an external service) does that and then informs the load balancer that a given threshold has been crossed (e.g. RAM > 90% used). The load balancer then raises a semaphore, "don't route new requests to server1", and the external service continues to poll server1 until RAM < 90%. But what if all servers report RAM > 90%? You can see how quickly it gets complicated. In cloud load balancing these metrics are used to scale the server pool behind the load balancer up and down dynamically.
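As a sketch of that external-poller pattern in Python (the /ram metric endpoint and the lb-admin drain/enable API below are invented for illustration; a real deployment would use whatever control interfaces the backends and the load balancer actually expose):

```python
# External poller (sketch). The /ram metric endpoint and the lb-admin
# drain/enable API are hypothetical placeholders for whatever interfaces
# your backends and load balancer actually expose.
import time
import urllib.request

SERVERS = ["server1", "server2"]  # illustrative backend hostnames
RAM_THRESHOLD = 90.0              # percent used

def ram_used_percent(server):
    # Assumes each backend reports its RAM usage over HTTP (hypothetical).
    with urllib.request.urlopen(f"http://{server}:8080/ram") as r:
        return float(r.read().decode())

def set_routing(server, enabled):
    # Tell the load balancer to enable or drain a backend (hypothetical API).
    action = "enable" if enabled else "drain"
    urllib.request.urlopen(f"http://lb-admin:9000/{action}/{server}")

if __name__ == "__main__":
    while True:
        for server in SERVERS:
            set_routing(server, ram_used_percent(server) < RAM_THRESHOLD)
        time.sleep(5)
```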

Have a look here for an overview: https://support.f5.com/kb/en-us/products/em/manuals/product/em-health-monitoring-3-0-0/11.html

I voted your question back up; people should comment when downvoting questions.


Typically the load-balancing component of the load balancer is only aware of the number of active network connections and/or requests it has sent to a back-end server, and knows nothing of the actual load those generate on the back-end system.

The load balancing algorithm you select determines which back-end server will handle the next new connection/request the loadbalancer receives.

The simplest is round-robin, where each subsequent new connection/request goes to the next available back-end server.

In addition to round-robin, most load balancers also support weighted load-balancing algorithms that can send proportionally more or fewer requests/connections to specific predefined back-end servers (e.g. with back-end server A at weight 1 and B at weight 2, server A will handle 1/3 of all new requests and server B 2/3).
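A minimal sketch of that weighted behaviour in Python, using naive slot expansion (real load balancers typically interleave more smoothly):

```python
# Weighted round-robin (sketch): each server appears in the rotation
# once per unit of weight, so weights 1 and 2 yield a 1/3 : 2/3 split.
import itertools

def weighted_round_robin(servers):
    pool = [name for name, weight in servers for _ in range(weight)]
    return itertools.cycle(pool)

rotation = weighted_round_robin([("A", 1), ("B", 2)])
print([next(rotation) for _ in range(6)])  # ['A', 'B', 'B', 'A', 'B', 'B']
```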

By adding a monitoring component, certain load balancers are capable of dynamically adjusting that weight: when one back-end server starts slowing down compared to the others, it will dynamically receive fewer new connections or requests.
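A sketch of one way such dynamic adjustment could be computed, assuming an illustrative inverse-latency policy rather than any particular product's algorithm:

```python
# Dynamic weight adjustment (sketch). Illustrative policy: a server's
# weight is inversely proportional to its recent average response time,
# so a slowing server automatically receives fewer new requests.
from collections import deque

class DynamicWeights:
    def __init__(self, servers, window=50):
        # Keep a sliding window of latency samples per server.
        self.samples = {s: deque(maxlen=window) for s in servers}

    def record(self, server, latency_ms):
        self.samples[server].append(latency_ms)

    def weights(self):
        avg = {s: sum(d) / len(d) if d else 1.0
               for s, d in self.samples.items()}
        # Normalize inverse latency to small integer weights for a WRR pool.
        return {s: max(1, round(100 / a)) for s, a in avg.items()}

dw = DynamicWeights(["A", "B"])
dw.record("A", 20)
dw.record("B", 80)   # B is four times slower than A
print(dw.weights())  # {'A': 5, 'B': 1}
```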

I would say that adjusting the loadbalancer based on available disk space in a back-end server is definitely a non-standard performance metric. :)

With regard to load balancing based on "requirements for the expected operation": that requires a deep understanding of the protocol you're designing, and do you really want to duplicate such logic in the load balancer?