Networked storage for a research group, 10-100 TB

This is related to this post:

Scalable (> 24 TB) NAS for research department

but perhaps a little more general.

Background:

We're a research lab of around 10 people who do a lot of experiments that involve taking pictures at one of several lab setups and then analyzing them on one of several lab computers. Each experiment may produce 2 or 3 GB of data, and we are generating data at the rate of about 10 TB/year.

Right now, we are storing the data on a 6-bay Netgear ReadyNAS Pro, but even with 2 TB drives, this only gives us 10 TB of storage. Also, right now we are not backing up at all. Our short-term backup plan is to get a second ReadyNAS, put it in a different building, and mirror the first onto the second. Obviously, this is somewhat non-ideal.

Our options:

1) We can pay our university $400/TB/year for "backed up" online storage. We trust them more than we trust us, but not a whole lot.

2) We can continue to buy small NASs and mirror them between offices. One limit, although stupid, is that we don't have an unlimited number of ethernet jacks.

3) We can try to implement our own data storage solution, which is why I'm asking you guys.
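For a rough sense of scale on option 1, here is a small Python sketch of what the university bill might look like over time. The billing model is my assumption, not something the university has confirmed: nothing is ever deleted, we are billed once a year on the total stored at year end, and the price stays flat. The function name yearly_bills is just for illustration.

```python
# Rough projection of the university storage bill (option 1).
# Assumptions (not confirmed by the university): data is never deleted,
# billing is once a year on the total stored at year end, price stays flat.

PRICE_PER_TB_YEAR = 400   # dollars per TB per year, from option 1
GROWTH_TB_PER_YEAR = 10   # TB of new data per year, from the background


def yearly_bills(years, growth=GROWTH_TB_PER_YEAR, price=PRICE_PER_TB_YEAR):
    """Return a list with the bill for each year, year 1 first."""
    return [growth * year * price for year in range(1, years + 1)]


if __name__ == "__main__":
    bills = yearly_bills(5)
    for year, bill in enumerate(bills, start=1):
        print(f"year {year}: {GROWTH_TB_PER_YEAR * year} TB stored, ${bill}")
    print(f"total over 5 years: ${sum(bills)}")
```

Under those assumptions the bill grows every year because the stored total grows, so the five-year total is considerably more than five times the first year's bill.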

One thing to consider is that we're a very transient population and none of us are network administration experts. I will probably be here only another year or so, and graduate students, who are here the longest, have a 5-6 year time scale. So nothing can require expert oversight.

Our data transfer rates are low - most of the data will just sit on the server waiting for someone to look at it once or twice - so we don't need a really high speed system.

Given these constraints, can someone recommend a fairly low-cost, scalable, more or less turn-key shared data storage system with backup in a separate physical location? Does such a thing exist, or should we just pay the university to take care of it for us?

As a second question, our professor just got tenure and is putting together a budget. Here the goal is to ask for as much as you can and hope you get a fraction of it. So the same question, minus the low-cost part: without budget constraints, can you recommend a scalable, turn-key, backed-up storage system?

Thanks


Solution 1:

There's an excellent and extremely detailed article on building NAS "pods" by a company that developed the system for its own use, at http://www.backblaze.com/petabytes-on-a-budget-how-to-build-cheap-cloud-storage.html . They describe it as "67 TB for $7,867", which is very good going. They run JFS on top of RAID-6 volumes under Debian; they then offer that via HTTPS, but there's nothing to stop you putting (e.g.) Samba in there instead (you don't say what your current remote-file-access protocol is).
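To put that headline figure next to the numbers in the question, here is a back-of-the-envelope sketch in Python. It only uses the quoted "67 TB for $7,867", the 10 TB/year growth rate and the $400/TB/year university price. It ignores power, hosting, replacement drives and staff time, and it takes the 67 TB at face value (if that is raw capacity, usable space after RAID-6 parity will be lower). The function names are mine, purely for illustration.

```python
# Back-of-the-envelope comparison using only figures quoted in this thread.
# Assumptions: hardware cost only (no power, hosting, spares, or labour),
# and the quoted 67 TB is taken at face value.

POD_COST = 7867            # dollars, from the Backblaze article title
POD_CAPACITY_TB = 67       # TB, from the Backblaze article title
GROWTH_TB_PER_YEAR = 10    # TB/year, from the question


def cost_per_tb(cost=POD_COST, capacity_tb=POD_CAPACITY_TB):
    """One-off hardware cost per TB for a single pod."""
    return cost / capacity_tb


def mirrored_pair_cost(cost=POD_COST):
    """Hardware cost of two pods, one of them in another building."""
    return 2 * cost


def years_to_fill(capacity_tb=POD_CAPACITY_TB, growth=GROWTH_TB_PER_YEAR):
    """Years until one pod is full at a constant growth rate."""
    return capacity_tb / growth


if __name__ == "__main__":
    print(f"one pod: ${cost_per_tb():.2f} per TB, one-off")
    print(f"mirrored pair: ${mirrored_pair_cost()}")
    print(f"years to fill one pod: {years_to_fill():.1f}")
```

The one-off hardware cost per TB comes out well under a single year of the university's $400/TB, but that comparison leaves out everything the university price presumably includes, so treat it as a starting point rather than a verdict.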

Disclaimer: I know nothing about these people except what I have read, and I haven't tried to build one of these myself. Nevertheless, unless they have been faking photos, they really do build and deploy a bunch of these things, and they haven't yet gone out of business.

Edit: it took me a little longer to find the specific supplier list (the detailed parts list is in the original link above), but it's at http://blog.backblaze.com/2009/10/07/backblaze-storage-pod-vendors-tips-and-tricks/#more-199 . I really do admire the way these guys have thrown open their detailed infrastructure for reuse; but as they say in the original posting:

Finally, we thank the thousands of engineers who slaved away for millions of hours to bring us the pod components that are either inexpensive or totally free, such as the Intel Processor, Gigabit Ethernet, ridiculously dense hard drives, Linux, Tomcat, JFS, etc. We realize we’re standing on the shoulders of giants.

I don't know about their product (I have my own tape stacker for backups) but I approve of their humility.

Solution 2:

I think this is one of the cases where you should outsource it. Let the university IT department handle the storage; they will take care of the backup and maintenance of the storage solution. It will be better in the long run.