1 EC2 instance per website - manage multiple websites on Amazon cloud using EC2

I'm managing multiple websites, most of them based on WordPress and all of them on a LAMP stack. I'm moving all my websites to Amazon cloud. I'm new to AWS and my plan is to move the websites one at a time, starting with my smallest one.

My question is: should I put all my sites on one EC2 instance, or each website on its own instance?

This probably sounds stupid, since anyone would definitely choose the latter in a conventional web hosting context. The reasons I am considering the former are:

  • Reusable LAMP stack
    • Can I generate my own AMI with a LAMP stack ready to use, so that I can reuse it for my different websites? The reasons I don't go for the community AMIs are that
      • I just don't know which one to use
      • As someone who is not completely clueless about Linux or the LAMP stack, I would like to have just what I need, no more, no less
  • Cost: putting every site on a single huge instance vs. putting each site on a smaller instance. I don't think this should be the case, but I thought there would be no harm in asking
  • Difficulty in managing multiple instances

Scalability

I'm sure that in a few months' time I will need roughly three times the computing power I have right now (new sites launched: I have 6 now and will have 10 by then; traffic to existing sites is increasing fast). For whatever reason, let's say I decide to go for horizontal scaling, e.g. using 3 instances of the same type I'm using at the moment.

So my further question is: how will this scenario affect my decision on whether to separate my sites or put them all together on one EC2 instance (or one group of instances)?

I know this probably has to do with the difference between vertical and horizontal scaling on Amazon cloud, which I'm still reading about. It also probably has to do with knowledge about virtual machines/servers, of which I'm a complete idiot but will not mind figuring out more if necessary. However, I thought I should just ask, since this might have implications for the direction I should take with Amazon cloud. Feel free to give me a slap if you think I'm such a lazy bum and should do my homework first :)

All help is much appreciated!

Disclaimer: Please advise if this should be posted on superuser.com or any stackexchange site. Thanks


Firstly, some raw data, taken from S. Ostermann, et al, 2010:

Basic instance specs:

+-----------+---------+------+-------+-------+------+-------+---------------+---------------+
|   Name    |  ECUs   | RAM  | Archi |  I/O  | Disk | Cost  |    Reserve    | Reserved Cost |
|           | (Cores) | [GB] | [bit] | Perf. | [GB] | [$/h] | [$/y], [$/3y] | [$/h]         |
+-----------+---------+------+-------+-------+------+-------+---------------+---------------+
| m1.small  | 1 (1)   | 1.7  | 32    | Med   | 160  | 0.1   | 325, 500      | 0.03          |
| m1.large  | 4 (2)   | 7.5  | 64    | High  | 850  | 0.4   | 1300, 2000    | 0.12          |
| m1.xlarge | 8 (4)   | 15   | 64    | High  | 1690 | 0.8   | 2600, 4000    | 0.24          |
| c1.medium | 5 (2)   | 1.7  | 32    | Med   | 350  | 0.2   | 650, 1000     | 0.06          |
| c1.xlarge | 20 (8)  | 7    | 64    | High  | 1690 | 0.8   | 2600, 4000    | 0.24          |
+-----------+---------+------+-------+-------+------+-------+---------------+---------------+

Basic performance/cost analysis:

+---------------+------------+----------+--------+-----------+---------+--------+-----------+----------+
|    System     | Peak Perf. |   HPL    | STREAM | RandomAc. | Latency | Bandw. | GFLOP/ECU | GFLOPS/$ |
|               | [GFLOPS]   | [GFLOPS] | [GBps] | [MUPs]    | [µs]    | [GBps] |           |          |
+---------------+------------+----------+--------+-----------+---------+--------+-----------+----------+
| m1.small      | 4.4        | 1.96     | 3.49   | 11.6      | -       | -      | 1.96      | 19.6     |
| m1.large      | 17.6       | 7.15     | 2.38   | 54.35     | 20.48   | 0.7    | 1.79      | 17.9     |
| m1.xlarge     | 35.2       | 11.38    | 3.47   | 168.64    | 17.87   | 0.92   | 1.42      | 14.2     |
| c1.medium     | 22         | 3.91     | 3.84   | 46.73     | 13.92   | 2.07   | 0.78      | 19.6     |
| c1.xlarge     | 88         | 51.58    | 15.65  | 249.66    | 14.19   | 1.49   | 2.58      | 64.5     |
| 16x m1.small  | 70.4       | 27.8     | 11.95  | 77.83     | 68.24   | 0.1    | 1.74      | 17.4     |
| 16x c1.xlarge | 1408       | 425.82   | 16.38  | 207.06    | 45.2    | 0.75   | 1.33      | 33.3     |
+---------------+------------+----------+--------+-----------+---------+--------+-----------+----------+

Actual performance (HPL) is usually under 50% of the theoretical peak performance. The one set of values that might be suspect is that for c1.medium, which doesn't quite agree with the expected results (e.g. bandwidth).
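As a rough check on the last two columns of the table, the following Python sketch recomputes them from the HPL, ECU, and price figures listed above. It assumes that GFLOP/ECU is the HPL result divided by the ECU count and that GFLOPS/$ is the HPL result divided by the hourly on-demand price; the function and field names are illustrative and do not come from the paper.

```python
# Figures copied from the two tables above (single instances only).
# name: (ECUs, on-demand cost in $/h, peak GFLOPS, HPL GFLOPS)
INSTANCES = {
    "m1.small":  (1,  0.1, 4.4,  1.96),
    "m1.large":  (4,  0.4, 17.6, 7.15),
    "m1.xlarge": (8,  0.8, 35.2, 11.38),
    "c1.medium": (5,  0.2, 22.0, 3.91),
    "c1.xlarge": (20, 0.8, 88.0, 51.58),
}


def derived_metrics(ecus, cost_per_hour, peak_gflops, hpl_gflops):
    """Recompute the derived columns (assumed definitions, see lead-in)."""
    return {
        "gflops_per_ecu": hpl_gflops / ecus,
        "gflops_per_dollar_hour": hpl_gflops / cost_per_hour,
        "fraction_of_peak": hpl_gflops / peak_gflops,
    }


if __name__ == "__main__":
    for name, spec in INSTANCES.items():
        m = derived_metrics(*spec)
        print(
            f"{name:10s} "
            f"GFLOP/ECU={m['gflops_per_ecu']:.2f} "
            f"GFLOPS/$={m['gflops_per_dollar_hour']:.1f} "
            f"HPL/peak={m['fraction_of_peak']:.0%}"
        )
```

The `fraction_of_peak` value is the ratio behind the "under 50%" remark; with these figures only c1.xlarge comes out above one half.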

The primary cost of EC2 for a typical workload is the cost of instances - other costs (bandwidth, provisioned storage, etc) are typically under 25% of the total cost. One doesn't expect performance to scale perfectly, and that is evident from the data above. Especially with regard to horizontal scaling, it seems that as you add more compute capacity, the efficiency drops off significantly.
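To put a number on that drop-off, here is a small Python sketch that compares the HPL result of each 16-instance cluster in the table with 16 times the single-instance result. The efficiency definition (cluster throughput divided by ideal linear throughput) is my own framing, not a metric taken from the paper.

```python
def scaling_efficiency(cluster_gflops, single_gflops, n_instances):
    """Cluster HPL throughput as a fraction of perfect linear scaling."""
    return cluster_gflops / (n_instances * single_gflops)


if __name__ == "__main__":
    # HPL values copied from the performance table above.
    small = scaling_efficiency(27.8, 1.96, 16)
    xlarge = scaling_efficiency(425.82, 51.58, 16)
    print(f"16x m1.small:  {small:.1%} of linear scaling")
    print(f"16x c1.xlarge: {xlarge:.1%} of linear scaling")
```

With the table's figures this works out to roughly 89% for m1.small and roughly 52% for c1.xlarge, so the loss is much more pronounced for the larger instance type.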

Given the above, and keeping in mind that there are other factors beyond raw compute performance (e.g. I/O performance, memory, etc) it stands to reason that vertical scaling is the most economical approach.

Unfortunately, there are other considerations beyond the economics of the scenario, reliability being a key one. With a single instance, the failure of that instance takes down your entire setup. One possible solution may be auto-scaling (i.e. maintaining an instance count of 1); however, a single instance is still prone to problems that may occur in a given availability zone, etc.

At some point it is necessary to scale horizontally - the question simply becomes when the ideal time is. I would probably suggest:

  • Scale vertically by at least a few instance sizes (much more so if you start with a t1.micro)
  • Move your databases to separate instances (because they don't scale the same way as your web servers)
  • Scale horizontally until you have a bit of redundancy
  • Scale vertically until you reach the maximum instance size
  • Scale horizontally thereafter (possibly using smaller instances initially)

Getting back to the questions at hand - running a single website per instance (or per set of instances) will always be more expensive. In addition to the fixed costs being higher (e.g. one load balancer per website instead of a single shared load balancer), you will not use your instances as efficiently: one website may see high load while the others are mostly idle, which leaves some instances overloaded and others sitting idle. In terms of logistics, the problem might not be as bad as one would imagine - the main problem comes down to managing everything independently. Configuration management tools (e.g. Puppet/Chef) can help you avoid that, but that step is usually not taken until your setup gets a bit larger.
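The utilisation argument can be illustrated with a small Python sketch. The load figures below are entirely hypothetical (three made-up sites, load expressed as a fraction of one instance's capacity at four times of day); the point is only that dedicated instances must cover each site's own peak, while a shared pool only has to cover the peak of the combined load.

```python
import math

# Hypothetical load per site, as a fraction of one instance's capacity,
# sampled at four times of day. These numbers are invented for illustration.
LOADS = {
    "site_a": [0.2, 0.9, 0.3, 0.1],
    "site_b": [0.8, 0.2, 0.1, 0.3],
    "site_c": [0.1, 0.3, 0.7, 0.2],
}


def instances_per_site(loads):
    """Each site gets its own instance(s), sized for that site's peak."""
    return sum(math.ceil(max(series)) for series in loads.values())


def instances_shared(loads):
    """All sites share one pool, sized for the peak of the combined load."""
    combined = [sum(values) for values in zip(*loads.values())]
    return math.ceil(max(combined))


if __name__ == "__main__":
    print("dedicated instances needed:", instances_per_site(LOADS))
    print("shared instances needed:   ", instances_shared(LOADS))
```

With these invented numbers the dedicated setup needs three instances and the shared pool needs two, because the sites do not peak at the same time. If all sites peaked together, the advantage would shrink.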

On the other hand, one of the limitations of EC2 instances is that you can only assign a single public IP address to a given instance. This has implications for SSL setups that need a dedicated IP address per certificate (i.e. those that cannot rely on SNI).

You can certainly generate your own AMIs - it is fairly standard practice actually. I usually start with Amazon's Linux AMI, as I find it to be the one with the least overhead (quite easy on resources, and fast) and the best supported by AWS (it is regularly updated, etc). I also prefer the RHEL/CentOS distributions (on which Amazon's Linux is based) to the Debian/Ubuntu ones that are the other popular choice. Once you have customized an instance, you can take snapshots of your EBS volume(s) and register an AMI, passing the snapshot ID as the image on which to base the root volume. In theory you can customize your operating system much more, even to the extent of building your own distribution (but still using the Amazon kernels) - however, unless you have a very specific use case, that is unlikely to be particularly beneficial. My personal preference for running WordPress is Varnish + Nginx + PHP-FPM (and W3TC for WordPress) - I find it is much easier on resources than the typical LAMP stack.
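The snapshot-then-register step might look roughly like the Python sketch below. It assumes a boto3 EC2 client is passed in as `ec2` and uses the client's `create_snapshot`, `get_waiter("snapshot_completed")`, and `register_image` calls; the helper name, the default device name, the architecture, and the virtualization type are assumptions for illustration and need to match your actual instance.

```python
def register_ami_from_volume(ec2, volume_id, ami_name,
                             root_device="/dev/xvda", architecture="x86_64"):
    """Snapshot an EBS root volume and register an AMI backed by it.

    `ec2` is expected to behave like a boto3 EC2 client. The defaults for
    root_device and architecture are assumptions, not universal values.
    """
    # 1. Snapshot the customized root volume.
    snapshot = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Root volume for {ami_name}",
    )
    snapshot_id = snapshot["SnapshotId"]

    # 2. Block until the snapshot has completed.
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # 3. Register an AMI whose root device is based on that snapshot.
    image = ec2.register_image(
        Name=ami_name,
        Architecture=architecture,
        RootDeviceName=root_device,
        VirtualizationType="hvm",  # assumption; depends on the instance type
        BlockDeviceMappings=[
            {"DeviceName": root_device, "Ebs": {"SnapshotId": snapshot_id}},
        ],
    )
    return image["ImageId"]
```

In practice you would create the client with `boto3.client("ec2")`, stop the instance (or at least quiesce writes) before snapshotting, and pass the ID of its root volume. The volume ID and AMI name are whatever you choose; nothing here is specific to the setup described in the question.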

Finally, to address scaling once more. Beyond the basic economics of the problem discussed above, the difficulty comes down to making multiple instances 'appear' as one. This includes ensuring that every instance serves the same data, load balancing between your instances, and perhaps handling details like PHP sessions. It will be more difficult to do if every site runs on its own set of instances, but likely not by a significant margin (since you will, hopefully, have configured the functionality into your AMI). Multiple instances do, however, mean a more complex system and more things to keep an eye on. (There are quite a few questions on ServerFault on this topic; if you need details on how to scale a specific setup, please ask it as another question.)

As a concluding comment - unless your setup has a particular need for an individual site to run on its own instance/cluster (e.g. a vastly different configuration or set of requirements), I would favour running multiple sites on a single instance/cluster, as it is simpler to scale, more economical and efficient, and more aligned with the spirit of 'cloud computing' (i.e. shared resources).

References:

  • S. Ostermann, et al., A Performance Analysis of EC2 Cloud Computing Services for Scientific Computing, 2010
  • A. Iosup, et al., Performance Analysis of Cloud Computing Services for Many-Tasks Scientific Computing, 2010 - see Table 9 for more details on horizontal scaling.