What sort of web server hardware do you use to handle 100 Mbps+ of static files?
I currently use Amazon S3 for much of my static file serving, but my monthly bill is getting very expensive. From some rough calculations against the logs, at peak times my most expensive Amazon bucket handles about 100-180 Mbps of traffic, mostly images under 50 KB.
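Back-of-the-envelope, assuming objects average near the 50 KB mark: 180 Mbps is about 22.5 MB/s, which works out to on the order of 450 requests per second at peak (more if most objects are well under 50 KB).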
S3 is hugely helpful when it comes to storage and redundancy but I don't really need to be paying for bandwidth and GET requests if I can help it. I have plenty of inexpensive bandwidth at my own datacenter, so I configured an nginx server as a caching proxy and then primed the cache with the bulk of my files (about 240 GB) so that my disk wouldn't be writing like crazy on an empty cache.
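The proxy is just the standard nginx proxy_cache arrangement, roughly like the following; paths, sizes, and the bucket hostname are placeholders rather than my exact values:

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=s3cache:256m
                 max_size=300g inactive=30d;

server {
    listen      80;
    server_name static.example.com;

    location / {
        proxy_pass        http://my-bucket.s3.amazonaws.com;
        proxy_cache       s3cache;
        proxy_cache_valid 200 30d;
        proxy_set_header  Host my-bucket.s3.amazonaws.com;
    }
}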
I tried cutting over and my server choked.
It looks like my disks were the problem - this machine has 4 x 1 TB SATA disks (Barracuda XT) set up in RAID 10. It's the only thing that I had on hand with enough storage space to be used for this. I'm pretty sure nginx was set up properly as I had already been using it as a caching proxy for another, smaller Amazon bucket. Assuming that this is a reasonable amount of traffic for a single machine, maybe an SSD would be worth a try.
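For anyone wanting to sanity-check the same symptom: extended iostat output (from the sysstat package) during the cutover makes a disk bottleneck fairly obvious; %util pinned near 100 with high await means the spindles are saturated.

# extended per-device stats, refreshed every 5 seconds
iostat -x 5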
If you handle large amounts of static file serving, what hardware do you use?
Additional information
Filesystem: ext4, mounted with noatime,barrier=0,data=writeback,nobh (I have battery backup on the controller).
Nginx: worker_connections 4096; worker_rlimit_nofile 16384; worker_processes 8; open_file_cache max=100000 inactive=60m.
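The relevant /etc/fstab entry looks roughly like this (device and mount point are placeholders for my actual ones):

/dev/sda3  /var/cache/nginx  ext4  noatime,barrier=0,data=writeback,nobh  0  0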
Solution 1:
I don't think your disks are the issue. First, nginx's cache uses a disk store, so disk speed is one potential cause of problems depending on how hot or cold your dataset is. However, I see no reason you couldn't serve 100 Mbps+ with the hardware you've mentioned, especially with nginx.
My first guess would be that your number of worker processes was low, your worker_connections were probably far too low, and your open_file_cache wasn't set high enough. However, none of those settings would cause high I/O wait or a spike like that. You say you're serving images under 50 KB, and it looks like a quarter of your dataset could easily be buffered by the OS. Nginx is surely not configured optimally.
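A reasonable starting point looks something like this (numbers are ballpark, tune to the box):

worker_processes     8;        # roughly one per core
worker_rlimit_nofile 65536;

events {
    worker_connections 4096;
}

http {
    open_file_cache          max=100000 inactive=60m;
    open_file_cache_valid    120s;
    open_file_cache_min_uses 1;
    open_file_cache_errors   on;
}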
Varnish handles the problem in a slightly different way, using RAM rather than disk for its cache.
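If you go that route, the key piece is just the storage backend; something like this (port and size are illustrative):

# keep the cache entirely in RAM: 6 GB of malloc storage,
# with the existing nginx/S3 proxy as the backend on :8080
varnishd -a :80 -b localhost:8080 -s malloc,6G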
Much depends on your dataset, but based on the data you've given, I don't see any reason for disk I/O to have spiked like that. Did you check dmesg and the logs to see whether one of your drives hit I/O errors at the time? The only other thing I can think of that might have caused the spike is exceeding nginx's open_file_cache, which would force it into a FIFO mode of opening new files.
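Quick ways to rule the drives out (assuming smartmontools is installed):

# look for ATA/controller errors logged around the time of the spike
dmesg | grep -iE 'ata|error|reset'
# check the SMART error log on each member drive
smartctl -l error /dev/sda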
Make sure your filesystem is mounted with noatime, which should cut a considerable number of write ops off your workload.
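That can be flipped on a live filesystem and then made permanent in /etc/fstab (mount point is an example):

# apply without unmounting
mount -o remount,noatime /var/cache/nginx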
As an example of a machine that regularly handles 800 Mbps:
# uptime
11:32:27 up 11 days, 16:31, 1 user, load average: 0.43, 0.85, 0.82
# free
total used free shared buffers cached
Mem: 8180796 7127000 1053796 0 1152 2397336
-/+ buffers/cache: 4728512 3452284
Swap: 8297568 237940 8059628
Quadcore Xeon:
Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
$ ./bw.pl xxx.xxx 2010-09-01 2010-09-30
bw: 174042.60gb
average 543mb/sec, peaks at 810mb/sec
=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.12 family
Device Model:     ST3500418AS
Serial Number:    6VM89L1N
Firmware Version: CC38
User Capacity:    500,107,862,016 bytes
Linux 2.6.36-rc5 (xxxxxx) 10/04/2010 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
4.33 0.00 2.40 5.94 0.00 87.33
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 109.61 19020.67 337.28 19047438731 337754190
avg-cpu: %user %nice %system %iowait %steal %idle
8.09 0.00 3.40 10.26 0.00 78.25
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 138.52 21199.60 490.02 106210 2455
avg-cpu: %user %nice %system %iowait %steal %idle
3.74 0.00 3.25 9.01 0.00 84.00
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 125.00 21691.20 139.20 108456 696
avg-cpu: %user %nice %system %iowait %steal %idle
4.75 0.00 3.12 14.02 0.00 78.11
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 154.69 19532.14 261.28 97856 1309
avg-cpu: %user %nice %system %iowait %steal %idle
6.81 0.00 3.36 9.48 0.00 80.36
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 112.80 17635.20 309.00 88176 1545
MRTG:
http://imgur.com/KYGp6.png
Dataset:
# du -sh ads
211.0G ads
# ls|wc -l
679075
Solution 2:
Your. Disks. Suck. Period.
Try getting a lot more, and a lot faster, disks. SAS works nicely here, as do VelociRaptors.
That said, the best option would be getting... an SSD.
Your disks probably do around 200 IOPS each. With SAS you can get that up to around 450, with VelociRaptors to about 300. A high-end SSD can get you... 50,000 (no joke, I really mean 5 0 0 0 0) IOPS.
Do the math ;) A single SSD, no RAID, would be about 62 times as fast as your RAID 10 ;)
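Roughly: four SATA spindles in RAID 10 can serve reads from all four disks, so about 4 x 200 = 800 random-read IOPS, and 50,000 / 800 comes out to roughly 62.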
Solution 3:
We're serving about 600 Mbps off of a server with SSDs on the backend, and nginx+varnish on the front. The actual processor is a little Intel Atom; we've got four of them behind a LB doing 600 Mbits/sec each (using DSR). Perhaps not appropriate for every situation, but it's been perfect for our use case.