Memcached scaling strategy
Currently I am running a production environment with 4 dedicated memcached servers, each with 48 GB of RAM (42 GB dedicated to memcached). They are doing fine right now, but traffic and content are growing and will surely keep growing next year.
What are your thoughts on strategies for scaling memcached further? How have you handled this so far?
Do you add more RAM to the existing boxes up to their full capacity, effectively doubling the cache pool on the same number of boxes? Or do you scale horizontally by adding more boxes of the same spec, with the same amount of RAM?
The current boxes can certainly handle more RAM, as their CPU load is quite low and memory is the only bottleneck. Still, I wonder whether it would be a better strategy to distribute the cache across more boxes, making things more redundant and minimizing the impact of losing one box (losing 48 GB of cache versus losing 96 GB). How have you handled, or how would you handle, this decision?
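To put rough numbers on the failure impact, here is a back-of-the-envelope sketch (plain Python, made-up node names and key counts; it assumes the client does consistent hashing, so only the dead node's keys are lost rather than the whole key-to-server mapping being reshuffled):

```python
import hashlib

def ring(nodes, vnodes=100):
    """Build a simple hash ring: (hash, node) points sorted by hash."""
    points = []
    for node in nodes:
        for i in range(vnodes):
            h = int(hashlib.md5(f"{node}-{i}".encode()).hexdigest(), 16)
            points.append((h, node))
    return sorted(points)

def owner(points, key):
    """The node owning a key is the first ring point at or after its hash."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    for point_hash, node in points:
        if point_hash >= h:
            return node
    return points[0][1]  # wrap around the ring

for n in (4, 8):
    nodes = [f"mc{i}" for i in range(n)]
    points = ring(nodes)
    keys = [f"key:{k}" for k in range(20000)]
    # Count the keys that lived on mc0, the node we pretend just died.
    lost = sum(1 for k in keys if owner(points, k) == "mc0")
    print(f"{n} nodes: ~{lost / len(keys):.1%} of cached keys lost if one node dies")
```

With four nodes a single failure costs roughly a quarter of the cache; with eight, roughly an eighth. That is the redundancy argument in a nutshell.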
I so want to know what it is you're moving that consumes over 100 GB of memory while not maxing out your NICs.
Memcached scales fairly linearly across machines, so the questions you have to ask are:
- Is my system bus currently saturated?
- This might not show up as CPU usage; DMA transfers won't register that way
- How expensive is high-density memory versus a new box providing the same increase in memory?
- Full cost of rack space, power consumption, etc.
- Do you see a fundamental difference between losing 25% of your cache 1% of the time and losing 12.5% of your cache 2% of the time? (Failure rates chosen arbitrarily; see the quick arithmetic sketch after this list.)
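That last question is less scary than it sounds once you run the numbers. A quick sketch with the made-up rates from the bullet shows the long-run average cache loss is identical either way; the real difference is the size of the miss spike your backend eats during any single failure:

```python
# Expected cache unavailability for the two hypothetical failure profiles above.
scenarios = {
    "4 big nodes":   {"cache_lost": 0.25,  "downtime_frac": 0.01},
    "8 small nodes": {"cache_lost": 0.125, "downtime_frac": 0.02},
}
for name, s in scenarios.items():
    expected = s["cache_lost"] * s["downtime_frac"]
    print(f"{name}: {s['cache_lost']:.1%} miss spike per failure, "
          f"{expected:.3%} of the cache unavailable on average")
```

Both come out to 0.25% of the cache unavailable on average, so the question really becomes whether your database can absorb a 25% miss spike as comfortably as a 12.5% one.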
Scaling is 10% intuition, 70% measuring and adapting, and 20% going back and trying something else.
Load 'em up until they max out the weakest link or stop being cost-effective. They may or may not already be there.
When I've done this, there is usually a break-even point between box size (rack space cost), the expense of high-density chips, and failure scenario handling. This almost always ends up with a configuration below the maximum memory density (and usually not the fastest chips available either), which, as you mentioned, reduces the impact of a node failure and usually makes the boxes more cost-effective. Some costs/things to consider when making this choice (a rough cost sketch follows the list):
- node cost (cpu/mem/etc)
- rack space cost
- administrative overhead/cost
- failure scenarios (are you trying to do N+1?)
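Here is the rough cost sketch mentioned above. Every number in it is a placeholder you should replace with your own hardware, rack, and admin figures, and "N+1" here just means one node's worth of cache is treated as spare headroom rather than usable capacity:

```python
def cost_per_usable_gb(nodes, gb_per_node, node_cost, rack_cost_per_node,
                       admin_cost_per_node, n_plus_1=True):
    """Total cost divided by the cache capacity you can actually count on."""
    total_cost = nodes * (node_cost + rack_cost_per_node + admin_cost_per_node)
    usable_gb = (nodes - 1 if n_plus_1 else nodes) * gb_per_node
    return total_cost / usable_gb

# Hypothetical comparison: 4 high-density boxes (96 GB of cache each, pricey
# DIMMs) versus 8 mid-density boxes (42 GB of cache each, commodity DIMMs).
print("scale up:  $%.2f per usable GB" % cost_per_usable_gb(4, 96, 14000, 1200, 800))
print("scale out: $%.2f per usable GB" % cost_per_usable_gb(8, 42, 5000, 1200, 800))
```

Whichever side wins with your real numbers, rerun it whenever DIMM prices or rack costs change; the break-even point moves.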
I have also done memory upgrades to max out boxes as clusters grow (usually when they are still pretty small), since in the short term it can be significantly cheaper to buy some more RAM and give yourself more time to make the larger architectural decisions.