10TB file storage design question on media website

I have a fairly busy media website where MP3 audio files are uploaded by members and streamed/downloaded from two Windows servers, which are load balanced at the moment... both servers simply mirror each other and are kept in sync.

What we currently do is simply add a new 2TB HDD each time the current drive gets full, then users upload data to the new drive... we have enough bays for 24 disks.

We are getting an I/O bottleneck on the most recently added HDD because all new media gets added to this drive, which is also the most popular... this could be overcome by spreading the data across each disk, but that gets complicated when we run out of space and add a new blank drive.

The reason I am mirroring my files is so that I have a 1:1 backup, failover in case one server goes down, and so that I can easily load balance my site with two machines.

Someone previously recommended using a NAS/SAN; unfortunately, I don't have access to this.

What would you recommend in my situation... is there a way I can improve my setup?

I read about distributed file systems the other day, which sounded like they might fit, however they all seem to be Linux-only... converting to Linux now would be a challenge to say the least, as I have little experience with it.

If I've missed anything that would help you answer, please let me know.

Thank You, Paul


Solution 1:

A data load-balancing problem. This is fun stuff. Here are some experiences I have had dealing with large sets of data, though we typically had it spread out over multiple servers.

  1. It sounds like you have not decoupled storage from presentation yet. You need to do this. Design an interface towards your storage (it can be presented as a separate file server, an NFS share, or similar). Personally I am strongly in favor of having a "media" server which only serves the data; see the sketch after this list. This way you move to the NAS model, and it will save you an enormous amount of pain as you grow.

  2. Once you have media separated from the application, you can start looking into solutions for handling the large amount of data you have.
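On point 1, here is a minimal sketch of what that decoupling can look like from the application side. The host names and URL layout are purely hypothetical: the web tier only knows a media ID and asks a small helper for the URL on a dedicated media host, so it never touches the storage disks directly.

    # Minimal sketch of the decoupling; host names and URL layout are hypothetical.
    import zlib

    MEDIA_HOSTS = ["http://media1.example.com", "http://media2.example.com"]

    def media_url(media_id: str) -> str:
        # Pick a host deterministically (crc32 is stable across runs), so the
        # same file is always served from the same place and caches stay warm.
        host = MEDIA_HOSTS[zlib.crc32(media_id.encode()) % len(MEDIA_HOSTS)]
        return f"{host}/files/{media_id}.mp3"

    print(media_url("track-0001"))  # e.g. http://media1.example.com/files/track-0001.mp3

Once everything goes through an interface like that, you can change what sits behind it (local disks, a dedicated media box, a NAS) without touching the application.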

There are a large number of commercial SAN products. They typically load balance over a large number of disks and handle adding/removing storage well. They are also very expensive, and it sounds like you already have hardware.

On the Linux side there is standard software to handle this amount of data without any problems. LVM and ext4 can handle very large filesystems (be careful with the fsck time, however). If I were to build this I would probably go with LVM and ext4 and serve the data using Apache. This combination would also let you grow the storage as large as needed.
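As a rough illustration of that growth path, adding a disk to an LVM-backed ext4 volume is only a handful of commands. The device, volume group and logical volume names below are placeholders, and the whole thing is just a sketch wrapped in a small script:

    # Hypothetical sketch: grow an LVM-backed ext4 volume when a new disk is
    # added. /dev/sdX, media_vg and media_lv are placeholder names.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def grow_media_volume(new_disk="/dev/sdX", vg="media_vg", lv="media_lv"):
        run(["pvcreate", new_disk])                              # new disk becomes an LVM physical volume
        run(["vgextend", vg, new_disk])                          # add it to the existing volume group
        run(["lvextend", "-l", "+100%FREE", f"/dev/{vg}/{lv}"])  # hand all new space to the logical volume
        run(["resize2fs", f"/dev/{vg}/{lv}"])                    # grow ext4 online to fill it

    grow_media_volume()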

But those are just general strategies. Now, to attack the specific problem you have. It is a little hard without knowing the implementation details, but I can offer some suggestions:

It sounds like you are not load balancing your I/O properly. I assume that you can track which disk serves which data. In that case, you should create a "rebalance" script. When you add a new disk to the system, this script takes data from all the old disks and fills up the new disk. Then you can spread incoming files over all disks and thereby get better balancing of the I/O load. This assumes that you have separate filesystems on the different disks, and are not just creating an enormous JBOD, which is a bad idea in general.
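A rough sketch of what such a rebalance pass could look like, assuming Windows drive letters and a per-drive media folder (the paths below are made up): it moves files from the fullest old drives onto the newly added drive until free space is roughly even, so both new uploads and existing hot files end up spread across all spindles.

    # Sketch of a one-shot rebalance pass; D:, E:, F: and G: are assumed drives.
    import shutil
    from pathlib import Path

    DRIVES = [Path(r"D:\media"), Path(r"E:\media"), Path(r"F:\media")]  # existing, mostly full drives
    NEW_DRIVE = Path(r"G:\media")                                       # freshly added, empty drive

    def free_bytes(path: Path) -> int:
        return shutil.disk_usage(path.anchor).free

    def rebalance():
        for drive in sorted(DRIVES, key=free_bytes):            # fullest drives first
            for f in drive.rglob("*.mp3"):
                if free_bytes(NEW_DRIVE) <= free_bytes(drive):
                    return                                      # free space is roughly even, stop
                dest = NEW_DRIVE / f.relative_to(drive)
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(f), str(dest))                  # remember to update your file-location index

    rebalance()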

A second step is to start profiling. Make a small application which logs each file request. If you see a specific disk being hit more than its fair share, swap data between it and the least-utilized disk. This kind of load balancing is preferably done as a regular job, perhaps every hour or day.
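For the profiling step, something as simple as the sketch below can get you started. It assumes a request log with one "drive,filename" line per served file (an invented format, adjust to whatever your logger writes): it counts hits per drive and lists the hottest files as candidates to move to the quietest drive.

    # Sketch of the hourly/daily balancing job; the log format is assumed.
    from collections import Counter

    def analyze(log_path="requests.log"):
        drive_hits, file_hits = Counter(), Counter()
        with open(log_path) as log:
            for line in log:
                drive, filename = line.strip().split(",", 1)
                drive_hits[drive] += 1
                file_hits[(drive, filename)] += 1

        hottest, hits = drive_hits.most_common(1)[0]
        coldest = min(drive_hits, key=drive_hits.get)
        fair_share = sum(drive_hits.values()) / len(drive_hits)
        if hits > fair_share * 1.5:                  # arbitrary "hot" threshold
            print(f"{hottest}: {hits} hits vs. fair share {fair_share:.0f}")
            print(f"candidates to move to {coldest}:")
            for (drive, filename), n in file_hits.most_common(10):
                if drive == hottest:
                    print(f"  {filename} ({n} hits)")

    analyze()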

Also, make sure you get large I/O caches. What typically kills I/O performance in the kind of application you have is serving so many different files that you overwhelm the caches, causing the disks to start thrashing. Max out the cache on your disk controllers, and put as much memory as you can into the system; Windows will happily use spare RAM as read cache. It's not hard, or even especially expensive, to stuff more than 128 GB of RAM into a server today, and that's a pretty large cache even if your hot file set is 1 TB.

With the amount of data you are serving I would suggest that you stay away from RAID solutions. Rebuilding large RAID arrays tends to be a painful experience.