Scaling server cluster setup
I'm in the middle of rebuilding a website that currently gets about 4 million+ visitors a month (and that number has been climbing fast lately). It's currently run and hosted by an external company, but we're dropping them, so I need to design the hosting myself.
I'm thinking about building a small cluster (probably on Linode):
- One Linode NodeBalancer to spread the load across the app servers. It can pin all traffic from a given client to one app server, but WordPress handles sessions via cookies, so that isn't critical.
- Two (or more) app servers: Linode (512?) VPSes running Debian 6/Apache2/PHP5/WordPress, with nginx in front of Apache for caching (a config sketch follows this list).
- One MySQL (or MariaDB?) database server (again, a VPS), and maybe a replication slave, with HyperDB splitting reads between them.
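A minimal sketch of that nginx caching layer, assuming nginx listens on :80 and proxies cache misses to Apache on :8080 on the same box (ports, paths, and cache sizes are placeholders, not a real config):

    # /etc/nginx/conf.d/wp-cache.conf (fragment)
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=wpcache:10m max_size=1g inactive=60m;

    # skip the cache for logged-in users, whose pages are personalized
    map $http_cookie $skip_cache {
        default               0;
        ~wordpress_logged_in  1;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            proxy_cache        wpcache;
            proxy_cache_valid  200 10m;     # keep good responses for 10 minutes
            proxy_cache_bypass $skip_cache; # fetch fresh pages for logged-in users...
            proxy_no_cache     $skip_cache; # ...and don't store their responses
            proxy_set_header   Host $host;
            proxy_set_header   X-Real-IP $remote_addr;
            proxy_pass         http://127.0.0.1:8080;
        }
    }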
Development is done in-house on a plain old FreeBSD/Apache2/MySQL/PHP5 box; deployment would just push new code to the app servers one at a time and any DB changes to the DB servers.
Backups would be stored locally. We could back up a single app server (they should all be identical, right?) at a low-traffic time to keep the load down.
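As a sketch of that idea (the 4:30am window, paths, and the db1.internal hostname are assumptions; MySQL credentials would live in /root/.my.cnf):

    # /etc/cron.d/site-backup on the app server being backed up
    30 4 * * *  root  tar czf /var/backups/wp-files-$(date +\%F).tar.gz /var/www
    45 4 * * *  root  mysqldump -h db1.internal --single-transaction --all-databases | gzip > /var/backups/db-$(date +\%F).sql.gz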
E-mail is handled via MailChimp. Easy.
WP itself is running W3 Total Cache with XCache. I'm considering a CDN for images and other static files, and cache headers are already in use for those static files.
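For reference, those cache headers look roughly like this in Apache (the 30-day lifetime is an assumption; mod_expires and mod_headers must be enabled):

    # vhost fragment: far-future expiry for static assets
    <FilesMatch "\.(jpe?g|png|gif|css|js|ico)$">
        ExpiresActive On
        ExpiresDefault "access plus 30 days"
        Header set Cache-Control "public"
    </FilesMatch>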
The plan is that as we expand, I can just add more app and/or db servers as needed.
In a nutshell: would this setup work? Would it be efficient? I've never built something like this before so I'd like to make sure I'm not missing something.
Just for reference: it's a news website. We run articles in several sections, some media, and visitors can comment on articles, sign up for our email list, etc.
Solution 1:
I'm in the process of setting up something similar to yours with a different company. I can't necessarily comment on how Linode works, but I want to highlight some things I've run across while looking into VPS hosting:
- I don't know the cost of Linode's load balancers, but I like having control over the configuration, so I run a general-purpose VPS with HAProxy or nginx doing the load balancing (I chose HAProxy; a config sketch follows this list).
- Make sure you have enough RAM. Unfortunately, the public offerings I've found so far don't let you adjust RAM and disk independently (I need more RAM, but nowhere near as much disk. Oh well).
- Make sure your VPSes are persistent. You don't want to lose data (especially MySQL's) if a VPS is shut down or its hardware fails. MySQL's documentation for EC2 covers this, and some of the same concepts apply here.
- Definitely replicate MySQL to a different server, or even to multiple slaves; you don't want to lose data (see the replication sketch after this list).
- Definitely pull the backups to an offsite server.
- If you're deploying to multiple load-balanced servers (two or more app servers), have a 'master' server that the others rsync from to pick up new code. That simplifies dev pushes: take the master out of the load-balancing scheme, make sure the slaves stop rsyncing from the old code, push the new code to the master, check that everything works as expected, turn rsyncing from master to slaves back on, then add the master back to the load-balancing scheme (a shell sketch of this dance also follows the list).
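A minimal HAProxy sketch for the load-balancer VPS (names, IPs, and the health-check URL are placeholders):

    # /etc/haproxy/haproxy.cfg (fragment)
    listen wp-front
        bind *:80
        mode http
        balance roundrobin
        option httpchk GET /
        server app1 10.0.0.11:80 check
        server app2 10.0.0.12:80 check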
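For the replication point, a minimal master/slave sketch (server IDs, hostnames, and credentials are placeholders):

    # master's my.cnf -- enable the binary log that slaves replay
    [mysqld]
    server-id = 1
    log-bin   = mysql-bin

    # each slave's my.cnf just needs a unique server-id
    [mysqld]
    server-id = 2

Then, once on each slave, point it at the master (take the real log file and position from SHOW MASTER STATUS on the master):

    CHANGE MASTER TO
      MASTER_HOST='db1.internal', MASTER_USER='repl', MASTER_PASSWORD='secret',
      MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=4;
    START SLAVE;

And for the offsite point, a pull from a machine outside the datacenter is enough:

    # on the offsite box: pull the nightly dumps (paths are assumptions)
    rsync -az backup@db1.example.com:/var/backups/ /srv/offsite/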
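And the master-based deployment dance as a shell sketch. The hostnames and paths are assumptions, and disable_backend/enable_backend are hypothetical helpers standing in for whatever your balancer uses to drain a node:

    #!/bin/sh
    # deploy.sh -- push new code to the master, verify, then fan out to slaves
    set -e
    MASTER=app1.internal
    SLAVES="app2.internal app3.internal"
    DOCROOT=/var/www/site

    # 1. Take the master out of rotation (hypothetical balancer command).
    ssh lb.internal "disable_backend $MASTER"

    # 2. Push the new code to the master and smoke-test it.
    rsync -az --delete ./build/ "$MASTER:$DOCROOT/"
    curl -fsS "http://$MASTER/" > /dev/null   # abort the deploy if the site errors

    # 3. Fan the verified code out from the master to each slave, one at a time.
    for host in $SLAVES; do
        ssh "$MASTER" "rsync -az --delete $DOCROOT/ $host:$DOCROOT/"
    done

    # 4. Put the master back in rotation (hypothetical balancer command).
    ssh lb.internal "enable_backend $MASTER"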
I'm sure there's more, but the setup outlined in the question seems fine.
Solution 2:
This setup is very similar to the web-server clusters we use. We use nginx upstreams on our "balance" server to pass requests to the web servers. I don't see why your config above wouldn't work, and it will let you expand horizontally as needed.
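A minimal sketch of that upstream arrangement (names and IPs are placeholders; ip_hash is optional, for when you want client affinity):

    # on the "balance" server: define the pool of app servers...
    upstream app_pool {
        ip_hash;              # pin each client IP to one backend
        server 10.0.0.11:80;
        server 10.0.0.12:80;
    }

    # ...and pass every request to the pool
    server {
        listen 80;
        location / {
            proxy_pass       http://app_pool;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }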
- I would also recommend using Heartbeat for HA on the load balancers; the Linux-HA documentation is a good reference.
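A minimal sketch of a two-node Heartbeat pair in the classic haresources style (node names and the floating IP are placeholders):

    # /etc/ha.d/ha.cf on both load balancers
    keepalive 2
    deadtime 15
    bcast eth0
    node lb1 lb2
    auto_failback on

    # /etc/ha.d/haresources -- lb1 normally holds the floating IP and runs nginx
    lb1 192.168.0.100 nginx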
- Do you currently have any backup/storage in place, or are you relying on RAID for redundancy? You might consider a NAS for snapshots/backups. Just a thought.