General Advice about an archive solution. ~15tb and growing.

I need a better storage and archive system for my small business's files. Specifically, the files are completed video projects. Beyond time and cost limitations, what is holding me back is that I don't believe in any of the solutions I have pondered. Therefore I am laying out the problem and my thoughts. I would appreciate any opinions.

Budget: I believe in spending what it takes. That being said, we are a small business. I am hoping I can get out of this for <5k and more around 1-3k. That might be a pipe dream. Just tell me so.

The Problem:

  • Raw video files are huge in filesize. We have accumulated probably 10+tb so far and that is growing fast.
  • Video editing requires fast read/write access to files, so a central or cloud-based file server will not be fast enough. Therefore we probably need an archive solution for old projects, and current projects will have to stay local.
  • We want some sort of redundancy and offsite solution.

What we currently do:

  • We use large, high quality, external hard drives.
  • We always buy in pairs and manually duplicate content. In other words, we work off of one, and duplicate the files to the other which serves as a backup/fall back.
  • These HDs are fast enough with firewire800 or USB3 to directly work off of.
  • Once filled, we set the pair aside.
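Because the duplication is manual, it can be worth checking that the second drive really matches the first before a pair is set aside. The following is a minimal sketch, assuming both drives are mounted as ordinary folders; the function names are made up for illustration and are not part of any backup tool.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_drives(primary, backup):
    """Return (missing_on_backup, mismatched) as sorted lists of relative paths."""
    a = tree_digests(primary)
    b = tree_digests(backup)
    missing = sorted(set(a) - set(b))
    mismatched = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    return missing, mismatched
```

Calling `compare_drives` with the two mount points (hypothetical paths, for example `/Volumes/ProjA` and `/Volumes/ProjA_copy`) returns two lists; both being empty means every file on the working drive has an identical copy on the backup. Hashing terabytes of video is slow, so this is something to run overnight rather than after every edit.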

What's wrong with the current solution:

  • Although the data is duplicated across two drives, these drives are not "backed-up" or stored offsite.
  • Organization across these many external HDs is hard. What project is on what drive? etc.
  • Eventually we are going to have a ridiculous amount of hard drives.
  • Duplication is not RAID.
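The "what project is on what drive?" problem can be eased with a simple catalog, whatever storage ends up being chosen. Here is a minimal sketch, assuming each drive keeps one top-level folder per project; the drive labels and the helper names are hypothetical.

```python
import os
from pathlib import Path


def folder_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def scan_drive(label, mount_point):
    """Return one catalog row per top-level project folder on a drive."""
    rows = []
    for entry in sorted(Path(mount_point).iterdir()):
        if entry.is_dir():
            rows.append(
                {"drive": label, "project": entry.name, "bytes": folder_size(entry)}
            )
    return rows


def find_project(catalog, name):
    """Return the labels of all drives that hold a project with this name."""
    return sorted({row["drive"] for row in catalog if row["project"] == name})
```

Running `scan_drive` once for each drive as it is filled, and keeping the combined rows in a spreadsheet or CSV file, lets `find_project` (or a plain text search) answer which physical drive to pull off the shelf.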

Options:

A Local Server

  • Buy a rack mount server and a rack mounted hard drive array enclosure, like a Norco, (SAS) (20 bays).
  • All video files would be stored on this server. We could install and pay for a cloud service to back up this one computer/server. CrashPlan works on Linux and has no limit on how much data you can store. The hard drives would be physical drives connected to the server, so we get around the "no NAS" rules companies like CrashPlan have. It is not a personal computer, so the syncing can run 24/7/365. This would solve the offsite issue.
  • Instead of using an online backup service like CrashPlan we could write a script to sync these files to an Amazon Glacier account.
  • A policy that video peeps work off of external hard drives for current projects but must put the project on this new computer when complete. In other words, continue using external hard drives for current projects and store archived projects on this server.

Cloud based backup services (CrashPlan.com, BackBlaze.com, Carbonite.com)

  • Typically only let you backup an external harddrive that is physically connected to a computer. (no NAS or network drives).
  • Typically they expect a backed up external drive to stay connected to your computer and all data to remain on the drive. If you don't hook up an external harddrive for months, what happens to the backups? If you clean up space by deleting old projects, they will be deleted from the online service too.
  • Requires our users to leave the external harddrives connected to their computer until all data is in the cloud. This can take weeks for a big project.
  • Restoring a project would be very slow due to internet transfer speeds.
  • These cloud backup accounts are usually specific to one user/one computer. So if a hard drive is backed up by one user and then a second user works on the project, what does that mean?

A Big NAS

  • A NAS is "Network Attached Storage". You stick in as many hard drives as it will hold. It will RAID them. You can access this via the network connection or maybe USB3/Firewire.
  • Most have an operating system baked in, so you can't run other software like cloud-based backup services. Nor can you do any customization or run your own software. You get what you buy.
  • Big NASs are pretty expensive and not really that big. You don't find many with more than 4 bays. Currently a big HD is 3tb. So 4 bays might be somewhere around <12tb of storage. Not super comfy for the future.
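The "<12tb" estimate can be checked with simple arithmetic. The sketch below assumes the usual rule of thumb that a parity RAID level gives up whole drives' worth of space (one for RAID 5, two for RAID 6) and ignores filesystem overhead; `usable_tb` is an illustrative helper, not a vendor formula.

```python
def usable_tb(bays, drive_tb, parity_drives=0):
    """Approximate usable capacity in TB: data drives times per-drive size."""
    if parity_drives < 0 or parity_drives >= bays:
        raise ValueError("parity_drives must be between 0 and bays - 1")
    return (bays - parity_drives) * drive_tb


raw = usable_tb(4, 3)                      # 4 bays of 3 TB, no redundancy: 12
raid5 = usable_tb(4, 3, parity_drives=1)   # one drive's worth of parity: 9
big = usable_tb(20, 3, parity_drives=2)    # 20-bay enclosure, two parity drives: 54
```

So a 4-bay unit with 3 TB drives and single-drive redundancy would hold roughly 9 TB, which is already less than the data described above, while a 20-bay enclosure leaves considerable headroom.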

Other ideas are:

  • Tape Backups.
  • Just archive the older projects directly to Amazon Glacier and skip building a local server to store them.

Thanks for any advice!!! Jed


Tape. Simple like that. Quantum has a SuperSTore system that can handle way more than that and I have seen them for less than your 5000 price point - new. The good thing is that you can pull tapes out for storage so scaling this is going to be quite cost efficient, and tapes last.


First, I would advise avoiding Glacier. It sounds good, until you crunch the costs on actually restoring a large amount of data. This is an unofficial calculator you can use to calculate Glacier storage and retrieval costs, and judge for yourself. Restoring terabytes of data from Glacier is a pretty unattractive prospect.

Second, I would advise that for simple backup purposes, you could get away with a single NAS server with a lot of drives. It sounds to me like you've only looked at home and small office NAS options, and you should consider a proper NAS offering. Preferring Dell, I would point out Dell's PowerVault NAS Servers, but HP, IBM, SuperMicro, and just about everyone else have similar offerings. I have an older Dell PowerVault NX at home that's serving as my media library, and has twelve 2 TB near-line SAS disks in it. 4 TB near-line SAS drives are available these days too, so you could always fill up a proper NAS server with those. (Or buy a couple of NAS servers.)

You could easily use one of these on your local LAN, install backup software of your choice (such as Bacula, if you like free, or any one of a dozen commercial offerings if you want vendor support) and use a large RAID volume as your backup target. You could then use a cloud backup service to back up this NAS server, and have the benefits of local and remote backups. Again, this is what I do at home: a proper NAS server, with terabytes of data backed up to a cloud service.

And of course, you could use tape too... buy an LTO tape drive or library - personally, I'll go to great lengths to avoid tape or optical disc media, but they are legitimate options, and may be cheaper than a disk-to-disk solution.

Finally, I would suggest that you need to consider the main drawback of cloud backup services, which is the size of your internet pipe. It may take weeks or months to upload terabytes of data over your internet connection, and/or incur extra fees from your ISP. So while they are a viable option for backing up data, even enterprise data, that's a constraint most people don't consider until they've already hit it.
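To put a rough number on "weeks or months", the upload time can be estimated from the data size and the upstream bandwidth. This is a minimal sketch; the 10 Mbit/s upload speed is an assumed figure for illustration, not something stated in the question, and real throughput is usually lower than the line rate.

```python
def upload_days(data_tb, upload_mbps, efficiency=1.0):
    """Days needed to upload data_tb terabytes at upload_mbps megabits per second.

    efficiency is the fraction of the line rate actually achieved (0 to 1).
    """
    bits = data_tb * 1e12 * 8                      # decimal terabytes to bits
    seconds = bits / (upload_mbps * 1e6 * efficiency)
    return seconds / 86400                         # seconds per day


days = upload_days(15, 10)   # 15 TB over an assumed 10 Mbit/s uplink
print(round(days))           # prints 139
```

At that assumed speed the initial upload of 15 TB would take roughly four and a half months of continuous transfer, and a full restore is limited by download speed in the same way.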


I think it depends on your budget. If you can only spend ~$6k, you'll probably need to build your own NAS. I'd look at nas4free and what a server costs you. If you can spend $20k, you can probably fill a server with a bunch of disks and a decent RAID card, or use software RAID under Linux or whatever.

For about $40k you can have a highish end 1U (IBM x3550 M4, 2 port Emulex 10GBit nic, 4 Gbit NIC, 128GB RAM, 2 local 10k SAS disks) with 10Gbit iSCSI to an Infortrend SAN box with 24 4TB SAS disks you can slice and dice however you want. RAID6 is a reasonable config.

Tape is also a good idea, but I don't know how cheap it really is. It depends on how big a library you get. If a 48 tape library is good, you can again do that with a 1U and external SAS card for maybe $30k and 2 LTO6 drives... But then you need software licenses to manage tape backups or something. I've only used NetBackup, which probably isn't a great fit for you here. Just don't forget you'll probably want to drive the tape library some way in software. And once a tape is out of the library, don't forget that someone has to go find it and load it up, plus you need a staging area for the access...