At what point is EBS usage the bottleneck?
Solution 1:
The factor with the most impact on your I/O performance is the instance type you're using:
Instance Type    I/O Performance
-------------    ---------------
t1.micro         Low
m1.small         Moderate
m2.xlarge        Moderate
c1.medium        Moderate
m1.large         High
m1.xlarge        High
m2.2xlarge       High
m2.4xlarge       High
c1.xlarge        High
cc1.4xlarge      Very High (10 Gigabit Ethernet)
cc2.8xlarge      Very High (10 Gigabit Ethernet)
cg1.4xlarge      Very High (10 Gigabit Ethernet)
As for EBS volumes and the performance that you'll get, as the AWS FAQ suggests, you'll need to benchmark your application to see what to expect:
Q: What kind of latency and throughput rates can I expect to see from Amazon EBS volumes?
The latency from an Amazon EC2 instance to an Amazon EBS volume is similar to the latency you would see from the local Amazon EC2 instance storage drive. I/O rates can vary significantly based on the size of the requests, the randomness of the access patterns, and the caching strategy used by the application. As such, the most accurate measure is to benchmark your specific application on an Amazon EBS volume.
In other words, the EBS rates you get may not necessarily be worse or better than local instance storage; it really depends on your data access pattern.
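To get a rough feel for how much the access pattern matters, here is a minimal Python sketch that reads the same blocks of a file once in order and once in shuffled order and reports throughput for each. The function names (`time_reads`, `compare_access_patterns`) and the 4 KiB block size are my own choices for illustration, not anything from AWS. It is no substitute for benchmarking your real application: in particular, the OS page cache will serve repeated reads from memory, so you would want a file much larger than RAM (or a dedicated tool such as fio) on the volume you actually care about.

```python
import os
import random
import time


def time_reads(path, block_size, offsets):
    """Read block_size bytes at each offset and return throughput in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for off in offsets:
            f.seek(off)
            total += len(f.read(block_size))
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)


def compare_access_patterns(path, block_size=4096, seed=0):
    """Return (sequential_mb_s, random_mb_s) for the same set of blocks."""
    n_blocks = os.path.getsize(path) // block_size
    sequential = [i * block_size for i in range(n_blocks)]
    shuffled = sequential[:]
    random.Random(seed).shuffle(shuffled)  # same blocks, random order
    return (
        time_reads(path, block_size, sequential),
        time_reads(path, block_size, shuffled),
    )
```

Point `compare_access_patterns` at a large file on the EBS volume and again at one on the ephemeral disk, and compare the two pairs of numbers.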
Further info is on the AWS EBS page:
Amazon EBS Volume Performance
Amazon EBS volumes are designed to offer higher throughput than Amazon EC2 instance stores for applications performing a lot of random accesses across your data set. You can also attach multiple volumes to an instance and stripe across the volumes to achieve further increases in throughput.
The exact performance will depend on the application (e.g. random vs. sequential I/O or large vs. small request sizes), so the best measure is to benchmark your real applications against the volume. Because Amazon EBS volumes require network access, you will see faster and more consistent throughput performance with larger instances.
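To illustrate why striping helps, here is a small sketch of how a RAID 0 layout spreads logical offsets across volumes. This is a simplified model, not the actual mdadm or LVM implementation; the function names and the 64 KiB chunk size are assumptions for the example.

```python
def locate_stripe(logical_offset, num_volumes, chunk_size=64 * 1024):
    """Map a logical byte offset to (volume_index, offset_within_volume)."""
    chunk_index, within_chunk = divmod(logical_offset, chunk_size)
    stripe_row, volume_index = divmod(chunk_index, num_volumes)
    return volume_index, stripe_row * chunk_size + within_chunk


def volumes_touched(offset, length, num_volumes, chunk_size=64 * 1024):
    """Return the sorted volume indexes a request of `length` bytes hits."""
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    return sorted({c % num_volumes for c in range(first_chunk, last_chunk + 1)})
```

In this model a 256 KiB read across four volumes touches all four, so the work can proceed in parallel, while a 4 KiB read still lands on a single volume. That is one reason the gain from striping depends on request size and access pattern.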
Also keep in mind that an instance's I/O performance covers network traffic as well as disk I/O. Because EBS volumes are reached over the network, the more network traffic your instance handles, the less bandwidth is left for disk I/O.
Depending on what you're serving, in-memory caching of objects may help considerably, if your application allows it.
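As a minimal sketch of the caching idea, Python's standard `functools.lru_cache` can keep recently used objects in memory so that repeat requests never reach the volume. `load_from_disk` here is a hypothetical stand-in for whatever read your application does, and the `READS` counter exists only to make the effect visible.

```python
from functools import lru_cache

READS = {"count": 0}  # how many times the slow path was taken


def load_from_disk(key):
    """Hypothetical stand-in for a read that would hit the EBS volume."""
    READS["count"] += 1
    return f"object-{key}"


@lru_cache(maxsize=1024)
def get_object(key):
    """Serve from memory when possible; fall back to disk on a miss."""
    return load_from_disk(key)
```

Calling `get_object` twice with the same key triggers only one disk read; `get_object.cache_info()` reports hits and misses. Whether this pays off depends on how often the same objects are requested and how much memory you can spare.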
Finally, here are some blog posts that benchmark EBS and local (ephemeral) volumes in various RAID configurations and describe tweaks for getting good I/O performance:
EC2 Ephemeral Disks vs EBS Volumes in RAID
Amazon EC2 I/O Performance: Local Ephemeral Disks vs. RAID 0 Striped EBS Volumes
Getting Good IO from Amazon's EBS