Recommended storage scheme for home server? (LVM/JBOD/RAID 5...)

Are there any guidelines for which storage scheme(s) makes most sense for a multiple-disk home server?

I am assuming a separate boot/OS disk (so bootability is not a concern, this is for data storage only) and 4-6 storage disks of 1-2 TB each, for a total storage capacity in the range 4-12 TB.

The file system is ext4; I expect there will be only one big partition spanning all disks.

As far as I can tell, the alternatives are:

individual disks

  • pros: works with any combination of disk sizes; losing a disk loses only the data on that disk; no need for volume management.
  • cons: data management is clumsy when logical units (like a "movies" folder) are larger than the capacity of any single drive.

JBOD span

  • pros: can merge disks of any size.
  • cons: losing a disk loses all data on all disks.

LVM

  • pros: can merge disks of any size; relatively simple to add and remove disks.
  • cons: losing a disk loses all data on all disks.

RAID 0

  • pros: speed.
  • cons: losing one drive loses all data; disks must be the same size.

RAID 5

  • pros: data survives losing one disk.
  • cons: gives up one disk's worth of capacity; disks must be the same size.

RAID 6

  • pros: data survives losing two disks.
  • cons: gives up two disks' worth of capacity; disks must be the same size.

I'm primarily considering either LVM or JBOD span, simply because either will let me reuse older, smaller-capacity disks when I upgrade the system. The runner-up is RAID 0 for speed.

I'm planning on having full backups to a separate system, so I expect the extra redundancy from RAID levels 5 or 6 won't be important.

Is this a fair representation of the alternatives? Are there other considerations or alternatives I have missed? And what would you recommend?


Solution 1:

Like you, I'm going through a rationalisation process with the disks in my home server. I too have a mix of disk sizes resulting from the organic growth of the JBOD setup I have.

I am taking the LVM route for the following reasons:

  1. It's the simplest.
  2. It allows me to reuse the disks I already have in the server.
  3. I have a complete backup of all the data that I am confident I can restore from.
  4. I am not concerned about the recovery time in the event of a disk failure.

For me the clinching factors are #3 & #4.
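
For reference, here is a minimal sketch of that route; the device names (/dev/sdb, /dev/sdc, /dev/sdd) and the mount point are hypothetical, so substitute your own:

    # Pool two existing disks of different sizes into one volume group.
    pvcreate /dev/sdb /dev/sdc
    vgcreate data /dev/sdb /dev/sdc
    lvcreate -n storage -l 100%FREE data    # one big logical volume
    mkfs.ext4 /dev/data/storage
    mount /dev/data/storage /srv/storage

    # Later, reuse another old disk and grow the filesystem online.
    pvcreate /dev/sdd
    vgextend data /dev/sdd
    lvextend -l +100%FREE /dev/data/storage
    resize2fs /dev/data/storage             # ext4 supports online growth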

Solution 2:

I'm using Greyhole and it fits my use case almost perfectly:

  • home server
  • re-use of spare HDDs of different brands, models, and sizes
  • all HDD space can be seen as one big mount point (like JBOD)
  • you can set different shares with different redundancy needs (e.g. Photos = max redundancy, Data = simple redundancy, Movies = zero redundancy; see the config sketch after this list)
  • HDD upgrades can be done one at a time (e.g. you can remove a 500 GB HDD and substitute a 4 TB HDD, expanding your total capacity)
  • the loss of one HDD loses only the zero-redundancy data residing on that HDD
  • if an HDD gives early warning that it is about to fail (from SMART monitoring), I can easily replace it with a different one without losing data
  • HDDs can be moved from SATA to a USB enclosure without doing anything
  • in fact, storage could be anything: SATA HDD, USB HDD, remote network share...
  • (VERY IMPORTANT) if you remove an HDD from the Greyhole system, it is a normally formatted ext4 disk with your data in your folders, easily readable from any machine
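
To illustrate the per-share redundancy point, here is a rough sketch of the relevant greyhole.conf directives; the paths and share names are hypothetical, and the exact syntax should be checked against the Greyhole documentation for your version:

    # /etc/greyhole.conf (sketch; hypothetical paths and share names)
    # Any mounted filesystem can join the storage pool.
    storage_pool_drive = /mnt/hdd0/gh, min_free: 10gb
    storage_pool_drive = /mnt/hdd1/gh, min_free: 10gb

    # Per-share redundancy: how many copies are kept across the pool.
    num_copies[Photos] = max    # a copy on every pool drive
    num_copies[Data] = 2        # two copies (simple redundancy)
    num_copies[Movies] = 1      # one copy (zero redundancy)

The shares themselves are ordinary Samba shares with the Greyhole VFS module enabled, which is why the data must be accessed through Samba (see the limitations below).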

Limitations:

  • Greyhole is best suited for files written once and read many times. Modifying a file in place inside a Greyhole volume is not recommended; it is better to move the file to another location, modify it there, then put it back in the Greyhole volume.
  • Greyhole data must be accessed from Samba shares (even locally).

Solution 3:

Well, on RAID systems it is not the disks that must be the same size...

just the partitions you want to add to the array need to be the same size.

The strengths of LVM are that you can easily grow your virtual disk by adding more partitions (physical volumes) to it, and that you get a snapshotting feature!

You can also combine LVM with RAID, so that you have data security and the flexibility of LVM :)
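
As a hedged sketch with hypothetical device names: equal-size partitions on unequal disks can back the array, LVM sits on top, and snapshots come along for free:

    # Three equal-size 1 TB partitions on disks of different total sizes.
    mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1
    pvcreate /dev/md0
    vgcreate data /dev/md0
    lvcreate -n storage -L 500G data

    # Copy-on-write snapshot; requires free space left in the VG.
    lvcreate -s -n storage-snap -L 50G /dev/data/storage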

Solution 4:

You can stack block devices in Linux and mix in the value of both Software RAID (MD) and LVM, which should address all your needs. This can all be accomplished from the non-GUI installer; a sketch of the commands follows the list below.

  • Use a single partition that spans 99% of the disk [1]
  • Create an MD RAID5 (preferably RAID6) with at least one hot spare
  • Initialize the MD array
  • Create an LVM VG
  • Add each MD device as a Physical Volume to the new VG [2]
  • Proceed to add swap and root logical volumes to VG
  • Format root with choice of filesystem (default is ext4)
  • Continue with installation
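
A hedged sketch of those steps, with hypothetical device names (/dev/sdb through /dev/sdf) and sizes:

    # Four active disks plus one hot spare in a RAID 6 array.
    mdadm --create /dev/md0 --level=6 --raid-devices=4 --spare-devices=1 \
          /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

    # The whole array becomes a single LVM physical volume.
    pvcreate /dev/md0
    vgcreate vg0 /dev/md0
    lvcreate -n swap -L 8G vg0
    lvcreate -n root -l 100%FREE vg0
    mkswap /dev/vg0/swap
    mkfs.ext4 /dev/vg0/root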

[1] I encountered a very nasty fault once on SATA disks that had lots of bad blocks. After I used the vendor tool to reconstitute the disk, my once-identical set of disks was no longer identical: the bad drive now had a few blocks fewer than before the low-level format began, which of course ruined my partition table and prevented the drive from rejoining the MD RAID set.

Hard drives usually have a "free list" of backup blocks reserved for just such an occasion. My theory is that that list must have been exhausted, and since this wasn't an enterprise disk, instead of failing safe and allowing me the opportunity to send it off for data recovery, it decided to truncate my data.

[2] Never deploy LVM without a fault-tolerant backing store. LVM doesn't excel at disaster recovery; you're just asking for heartache and, if you get it wrong, data loss. The only time it makes sense is when the VG is confined to a single disk, like an external USB disk or perhaps an external eSATA RAID. The point is to deploy your VG around backing stores that can be hot-plugged as a single unit, or as a virtual single unit, as demonstrated in the MD example above.

Solution 5:

What about http://zfsonlinux.org/?

It has the notion of disk pools to which you can attach and detach drives. I don't know if it's production-ready, but it's still worth checking out.
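
As a minimal sketch with hypothetical device names (attach/detach in this sense applies to mirror vdevs):

    # Two-way mirror pool with one filesystem dataset.
    zpool create tank mirror /dev/sdb /dev/sdc
    zfs create tank/movies

    # Attach a third disk to the mirror, then detach one of the originals.
    zpool attach tank /dev/sdb /dev/sdd
    zpool detach tank /dev/sdc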