What is the best way to configure a new ZFS server with lots of storage?

Assume the following drive setup (with ZFS):

Controller 1    Drive1    Drive4    Drive7    Drive10
Controller 2    Drive2    Drive5    Drive8    Drive11
Controller 3    Drive3    Drive6    Drive9    Drive12

VDEV setup:

vdev1: drive1, drive2, drive3
vdev2: drive4, drive5, drive6
vdev3: drive7, drive8, drive9
vdev4: drive10, drive11, drive12

Is it better (for reliability) to add all vdevs to the same zpool, or to create a separate zpool for each vdev? Also, if we lost a single vdev, would we lose the entire array? We don't need all the storage capacity in one place - the smaller zpools would be fine from a storage standpoint.

Update: For 3dinfluence's question about the boot pool, that will be on a RAID1 set. I don't like mixing the OS and my multi-terabyte RAID arrays.


The way ZFS works is that a pool is made up of one or more RAID sets, or groups (vdevs). To expand capacity you add additional groups - ideally at the same RAID level - to the pool. IO is then striped across all the groups in the zpool that have free blocks. So a zpool built from a large number of small disk groups is both fast and highly available.
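As a sketch of how that expansion works (device names here are hypothetical placeholders - substitute your actual devices):

```shell
# Create a pool with one 3-disk raidz group.
zpool create tank raidz c0t0d0 c1t0d0 c2t0d0

# Later, expand capacity by adding another raidz group of the same width;
# ZFS then stripes new writes across both groups.
zpool add tank raidz c0t1d0 c1t1d0 c2t1d0
```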

So I would suggest that you put all the drives in one pool, either as four 3-disk raidz groups or as six 2-drive mirror groups.

Raidz option

  • ZFS Pool
    • raidz Drive1, Drive2, Drive3
    • raidz Drive4, Drive5, Drive6
    • raidz Drive7, Drive8, Drive9
    • raidz Drive10, Drive11, Drive12

Pros

  • This would allow your setup to survive a controller failure.
  • It also will spread your IO across all 4 groups in the pool which will further improve throughput.
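This layout could be created in a single command - something like the following, using the question's drive names as placeholders for your real device paths:

```shell
# One pool, four 3-disk raidz groups; each group spans all three controllers,
# so any single controller failure costs only one disk per group.
zpool create tank \
    raidz drive1 drive2 drive3 \
    raidz drive4 drive5 drive6 \
    raidz drive7 drive8 drive9 \
    raidz drive10 drive11 drive12
```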

Mirror option

  • ZFS Pool
    • mirror Drive1, Drive2
    • mirror Drive3, Drive4
    • mirror Drive5, Drive6
    • mirror Drive7, Drive8
    • mirror Drive9, Drive10
    • mirror Drive11, Drive12

Pros

  • This would allow your setup to survive 1 controller failure.
  • It also will spread your IO across all 6 groups in the pool which may be faster.

Cons

  • You will lose 50% of the raw capacity of the drives.
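The mirror layout would be created similarly (again, drive names are placeholders for your real device paths):

```shell
# One pool, six 2-way mirrors; each mirror's two disks sit on different
# controllers, so a single controller failure leaves every mirror degraded
# but intact.
zpool create tank \
    mirror drive1 drive2 \
    mirror drive3 drive4 \
    mirror drive5 drive6 \
    mirror drive7 drive8 \
    mirror drive9 drive10 \
    mirror drive11 drive12
```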

I agree with 3dinfluence's suggestions. However, an IMHO even better solution would be to use RAID-Z2 (similar to RAID6; i.e. two "parity" disks instead of only one) with two pools like this:

  • raidz2 #1: 1,4,2,5,3,6
  • raidz2 #2: 7,10,8,11,9,12

Now, if a single disk fails you still have one "parity" disk left (compared to RAID-Z, where you would already have lost all your redundancy)! If you use high-capacity hard disks this can be a good idea, because ZFS resilvering (filling the replacement disk, i.e. recreating the redundancy information on the new disk) can take a long time, and with RAID-Z your data is at risk during all that time. (A single controller failure will not bring down a RAID-Z2 pool - but in that case redundancy would be lost.)
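That configuration would look something like this (pool names are just examples; note the disk ordering interleaves the controllers so no controller holds more than two disks of either group):

```shell
# Two pools, each a 6-disk raidz2; a single controller failure takes
# at most two disks from each group, which raidz2 can absorb.
zpool create tank1 raidz2 drive1 drive4 drive2 drive5 drive3 drive6
zpool create tank2 raidz2 drive7 drive10 drive8 drive11 drive9 drive12
```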

Another important issue: you should also think about spare disks - especially if you decide to go with RAID-Z, because a single disk failure will leave your data at risk until you replace the failed disk. (Disks prefer to fail on Friday night, which means you might have no redundancy for the entire weekend!)

Keeping the spare disk problem in mind you might even want to use this configuration:

  • raidz2 #1: 1,4,2,5,3
  • raidz2 #2: 7,10,8,6,9
  • spare disks: 11,12

This config would be safer with a rather small loss in total capacity.
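A sketch of that layout follows (pool names are examples; on Solaris-era ZFS a hot spare can be added to more than one pool, but check your platform's documentation before relying on shared spares):

```shell
# Two 5-disk raidz2 pools with hot spares; a spare is pulled in
# automatically when a disk in its pool fails.
zpool create tank1 raidz2 drive1 drive4 drive2 drive5 drive3
zpool create tank2 raidz2 drive7 drive10 drive8 drive6 drive9
zpool add tank1 spare drive11 drive12
zpool add tank2 spare drive11 drive12
```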