How to forecast the spec of a Linux load balancer?

I would like to build a Linux load balancer that does SSL offload with sticky sessions. I would like to do this with either Pound on its own, or Pound together with HAProxy. I haven't done this before, but I have been wanting to learn both HAProxy and Pound for a while now, and I finally have a small use case to use as my excuse to get my hands dirty.

The site is a forum with a peak of ~4 Mbps of throughput, which I think is a lot of posts and reads! So I don't need a high-throughput device; I'm more concerned about concurrent users.

I have the following queries, though:

  1. Where does the bulk of the workload sit on the load balancer: is it the CPU decrypting the SSL traffic, or the RAM caching sessions for sticky sessions?
  2. Following on from query 1, I have a spare server I would like to use, but how can I relate the server hardware specification I have to the web application performance I need?

    I have a small 1U server (PowerEdge 1850) with 2x 76 GB 10k Ultra 320 SCSI drives in RAID 1, 2x 3 GHz single-core Xeons (800 MHz bus, 2 MB L2 cache), and 6x 1 GB sticks of PC2-3200 400 MHz RAM. I would like to use this, but I have no experience with HAProxy or Pound, so I can't say whether it will be apt or not. I would assume 6 GB of RAM is way too much, looking at hardware load-balancer specs. What do others think about the CPU and HDDs?

This isn't a shopping thread, so please don't post server models that would be suitable. Instead, if this box isn't up to the task, I would like to go back to query (1) so I can build something else that will be sufficient. I have lots of experience with server deployments, just not with these two packages.

Thank you.


Solution 1:

HAProxy and SSL:
SSL support directly in HAProxy is very recent (still in development, made public about a week ago), so depending on your timeline you are going to need something like stunnel or nginx to offload the SSL. If you don't mind trying something new, here is a howto.

TPS, Concurrency, and Throughput:
The main factor here is probably going to be Transactions Per Second (TPS), so in order to forecast your load you will need to get that number somehow; most likely you will want to parse your web logs. Concurrency will really be a function of how long you keep sessions open. If you keep them open for a while, responses can seem faster because you don't have to keep redoing session creation (which is time-consuming and expensive when it comes to SSL). However, you don't want to keep things open so long that you use up a lot of memory.
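
If you want a rough way to get that TPS figure, a small sketch like the one below can pull a peak requests-per-second number out of an access log in the common/combined format. The log path and the timestamp regex are assumptions on my part; adjust them to match your own setup.

```python
# Rough peak requests-per-second estimate from an access log.
# Assumes common/combined log format with the timestamp in square brackets.
import re
from collections import Counter

LOGFILE = "/var/log/apache2/access.log"   # hypothetical path; point at your own log
TS_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})")

per_second = Counter()
with open(LOGFILE) as fh:
    for line in fh:
        m = TS_RE.search(line)
        if m:
            per_second[m.group(1)] += 1   # bucket requests into one-second slots

peak_time, peak_count = per_second.most_common(1)[0]
print(f"peak: {peak_count} requests/sec at {peak_time}")
```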

Estimating Capacity with HAProxy:
When it comes to memory performance, the HAProxy documentation does give some guidance:

Also, keep in mind that a connection contains two buffers of 8kB each, as well as some other data resulting in about 17 kB of RAM being consumed per established connection. That means that a medium system equipped with 1GB of RAM can withstand around 40000-50000 concurrent connections if properly tuned.
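
To put that figure against the spare box, here is a back-of-envelope sketch. The ~17 kB per connection comes from the quote above; the assumption that only about half of the 6 GB is left for connection state (after the OS, buffers, and the SSL terminator take their share) is mine.

```python
# Back-of-envelope connection capacity for the PowerEdge 1850's 6 GB of RAM,
# using the ~17 kB-per-connection figure from the HAProxy documentation.
BYTES_PER_CONN = 17 * 1024        # ~17 kB per established connection
TOTAL_RAM = 6 * 1024**3           # 6 GB in the spare server
USABLE_FRACTION = 0.5             # assumption: half left after OS + SSL offload

usable = TOTAL_RAM * USABLE_FRACTION
print(f"rough concurrent-connection ceiling: {usable / BYTES_PER_CONN:,.0f}")
# => roughly 185,000 connections, far more than a ~4 Mbps forum will ever see
```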

With SSL, most of the CPU work happens during the handshake phase; if whatever you are using to terminate SSL can cache the negotiated session keys, you can save a lot of CPU. See this article for a lot more detail on this.
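
If you want to see what a full handshake costs versus a resumed session on your own hardware, a quick sketch with Python's ssl module can give you a feel for it. The host below is a placeholder, and whether resumption actually kicks in depends on the server caching sessions (or issuing tickets); treat the numbers as indicative only.

```python
import socket
import ssl
import time

HOST = "example.com"   # placeholder; point at the server terminating your SSL
PORT = 443

ctx = ssl.create_default_context()

def handshake(session=None):
    """Open a TLS connection, optionally resuming a cached session.
    Returns (elapsed_seconds, session_object)."""
    start = time.perf_counter()
    with socket.create_connection((HOST, PORT)) as sock:
        with ctx.wrap_socket(sock, server_hostname=HOST, session=session) as tls:
            return time.perf_counter() - start, tls.session

full_time, sess = handshake()              # full handshake
resumed_time, _ = handshake(session=sess)  # resumed, if the server allows it

print(f"full handshake:    {full_time * 1000:.1f} ms")
print(f"resumed handshake: {resumed_time * 1000:.1f} ms")
```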

You can also use this person's benchmarks as a baseline to get an idea of what to expect.

You have to benchmark to be sure:
In the end you are going to have to benchmark to be sure; here is a reference to get you started. My personal impression is that at ~4 Mbps you are probably going to be fine.
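
For a first rough feel before you reach for a proper tool like ab, httperf, or siege, a minimal concurrency smoke test could look like the sketch below. The URL, concurrency, and request counts are placeholders; a dedicated benchmarking tool will give you far more trustworthy numbers.

```python
# Minimal concurrency smoke test against the balancer.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "https://lb.example.com/"   # placeholder; point at your load balancer
CONCURRENCY = 50                  # simultaneous workers
REQUESTS = 500                    # total requests to issue

def fetch(_):
    start = time.perf_counter()
    with urllib.request.urlopen(URL) as resp:
        resp.read()
    return time.perf_counter() - start   # per-request latency

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    latencies = list(pool.map(fetch, range(REQUESTS)))
elapsed = time.perf_counter() - start

print(f"requests/sec: {REQUESTS / elapsed:.1f}")
print(f"mean latency: {sum(latencies) / len(latencies) * 1000:.1f} ms")
```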