Software Alternative to RAID for Home Use

I have a ton of data that I've been keeping two complete copies of between my desktop and laptop for a couple of years now. I figure it's about time to buy some more drives and make sure I don't lose any of it to drive failure and whatnot.

At first I was looking at RAID, but then I started to wonder whether there are good purely software solutions for redundancy on a PC containing multiple drives. In my mind I picture having one empty drive, with a capacity as large as or larger than any other drive, to hold the reconstruction of a failed drive.

Is there anything reliable out there for a purpose like this? Also, what drive/hardware setup is recommended for such a task?

Edit: To clarify, I'm not against RAID, and I would be willing to just use backup solutions. I'm mostly curious to see what other options are out there for the situation I described. Thanks.


Solution 1:

There is software RAID built into new versions of Windows and Linux. Have you considered just mirroring a drive for RAID1?

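I'd like to add that simply mirroring a drive does not provide an adequate backup solution. A mirror copies corrupted or accidentally deleted data to both drives just as faithfully as good data, so data corruption can render that setup useless very quickly and with little to no warning.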
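To make the corruption point concrete, here is a minimal Python sketch of one way to notice silent changes on a drive: record a SHA-256 digest per file, then re-check later. This is only an illustration, not something built into Windows or Linux software RAID, and the function names (`file_digest`, `build_manifest`, `changed_files`) are made up for this example. It cannot tell corruption from a legitimate edit, so it suits archive data that is not supposed to change.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or no longer match the recorded digest."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

You would call `build_manifest` once while the data is known to be good, store the result somewhere other than the mirrored drive, and run `changed_files` periodically.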

If you're looking for a backup solution, I'm partial to Symantec Backup Exec. We utilize the Desktop and Laptop Option to provide backups for end user machines.

Solution 2:

There are a few ways to tackle this. If you're after availability, RAID's the way to go. If you're after redundancy, RAID or some form of file replication will solve the problem, albeit with caveats.

File replication (e.g. rsync or RoboCopy) gives you two (or more) copies of your data, as of a very specific point in time, spread over multiple spindles.

  • Upside: An OS / filesystem failure will not trash your offline replica. This works great for offsite backups.
  • Downside: You either need to build automation or follow a manual process to sync your data; your data is only as fresh as your last sync. You need to be aware of open files, and what they will resemble when the file arrives on the replica. As an example, database files are not safe to copy without snapshotting, quiescing or shutting down the database server as they will be inconsistent.
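As a rough illustration of what a one-way replication pass does, here is a minimal Python sketch. It is not how rsync or RoboCopy work internally (both do far more), and the names `needs_copy` and `replicate` are hypothetical. It copies files that are missing on the replica or that differ in size or have a newer timestamp on the source. It does not propagate deletions, and it does nothing about the open-file problem described above.

```python
import shutil
from pathlib import Path


def needs_copy(src: Path, dst: Path) -> bool:
    """Return True if dst is missing, differs in size, or is older than src."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    # Size and timestamp are cheap change signals, not a content comparison.
    return s.st_size != d.st_size or int(s.st_mtime) > int(d.st_mtime)


def replicate(source: Path, replica: Path) -> list[str]:
    """Copy new or changed files to the replica; return the relative paths copied."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = replica / rel
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves timestamps
            copied.append(rel.as_posix())
    return copied
```

Running `replicate(Path("/data"), Path("/mnt/backup/data"))` (both paths are placeholders) from a scheduled job would be the "build automation" option; the replica is still only as fresh as the last run.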

RAID is an availability technique used to keep your server up when it throws a disk. By virtue of mirroring or parity, the data can be regenerated onto a new disk under limited, well defined failure conditions.

  • Upside: Your server doesn't have to die just because a spindle has.
  • Downside: Can't (sensibly) remove a spindle to form consistent, quiesced storage (yes, I'm aware of ways to do it. Just don't. Please :) ). Any failure mode not explicitly covered by your chosen RAID level and implementation will result in data loss. A RAID card or OS bug can end up silently corrupting data across all disks simultaneously. Depending on the age of the disks, the act of regenerating the array can cause other disks to fail before the rebuild has completed, thus rendering the array useless.
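For anyone wondering how parity lets an array regenerate a lost disk, here is a toy Python sketch of the XOR idea behind single-parity schemes. It is a simplification under stated assumptions: three made-up "disks" that each hold one equal-sized block, with no striping, no parity rotation, and no real I/O. The helper name `xor_blocks` is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding one 8-byte block.
disks = [b"ALPHA___", b"BRAVO___", b"CHARLIE_"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(disks)

# Simulate losing disk 1, then rebuild it from the survivors plus parity.
survivors = [disks[0], disks[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == disks[1]
```

This also shows the limit mentioned above: with one parity block, losing a second disk before the rebuild finishes leaves nothing to reconstruct from.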

My recommendation is to combine the two: Use a RAID to keep the environment running; replicate the contents to another disk to create offsite storage.

Remember: RAID is not a backup solution.

Solution 3:

Both Linux and Windows have built-in RAID support. Windows XP and Vista support RAID1 and the server editions support RAID5, and you can set it up through the normal disk management screens.

Linux has full RAID support too (1, 5, 6, 10, and any combination of nested levels such as 1+0, 5+0).

If you want to avoid RAID completely for some reason then you could use rsync to maintain the duplicate copies.

Solution 4:

I have a ton of data that I've been keeping 2 complete copies of between my Desktop and Laptop for a couple years now; I figure it's about time to buy some more drives and make sure I don't lose any of it due to drive failure and whatnot.

While you phrased your question in terms of local RAID storage, the fact that you're sharing data between machines makes me wonder if you might be better served by an online backup/syncing solution.

There are a number of online backup solutions that make this rather painless. You might try one of them first, as the time and money investment for them is extremely low.

Dropbox (http://getdropbox.com) and JungleDisk (http://jungledisk.com) are two I can personally recommend.

Dropbox is great for automagic syncing between computers, especially if you need local copies of the data at all times. It works well on OS X and Windows (and I believe there's a Linux client too).

JungleDisk is backed by Amazon's S3 storage service and is more geared towards backups. These backups are available on any machine via a mapped network drive, and you can use that network drive for arbitrary file storage as well. The Pro version enables features like fast differential backups for a few dollars a month.

I use JungleDisk on a number of production machines and it's extremely affordable; you pay only Amazon's low S3 rates. For approximately 20GB of data storage a month, and considerable throughput, I pay something like $15.

Solution 5:

There are many solutions to this. It really depends on what you are looking to accomplish and which operating system you would like to use. I advocate using an OpenSolaris box with ZFS, because it's expandable, cheap, and not too difficult to configure.

If you are more comfortable with Linux, then the solution is your favorite distro plus mdadm, which IMHO can be a nightmare when something goes wrong if you are using anything other than RAID1.

Here are some Solaris takes on the issue:

DIY: Home NAS Box with OpenSolaris and ZFS

EON (Embedded Operating system/Networking), a RAM-based live ZFS NAS appliance released on Genunix, is very cool: it boots off a flash drive and, as long as your hardware is compatible, works like a charm.