ZFS and SAN -- best practices?
FWIW I have experience with up to 92 disks in a single ZFS pool and so far it works fine.
But if you're really talking about several hundred disks, I would consider partitioning them into a small number of disjoint (but still large) pools. I don't want to know how long e.g. a zpool scrub
runs on a 3000-disk pool (and you do want to scrub regularly). The output of commands like zpool status
would also be unwieldy with such a large number of disks. So why put all your eggs in one basket?
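As a rough sketch of what that looks like day to day (the pool names below are just placeholders), the per-pool operations stay manageable when the disks are split across a few pools:

    # Scrub each of a handful of mid-sized pools instead of one giant pool;
    # each scrub covers only its own pool's disks and finishes sooner.
    for pool in tank1 tank2 tank3; do
        zpool scrub "$pool"
    done

    # Health output stays readable per pool.
    zpool status tank1
    zpool list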
(Side note on dedup: although dedup can be controlled at the dataset level, it finds duplicates at the pool level, so you'll probably get worse dedup ratios if you partition as suggested. On the other hand, you'll need much more memory to hold the dedup table of a single giant pool, which might not fit into ARC+L2ARC if the pool is too big. So if you are using dedup, the amount of available memory is probably a good indicator of the maximum practical pool size.)
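One way to sanity-check that before committing is to let ZFS estimate the dedup table; the commonly quoted rule of thumb is on the order of a few hundred bytes of RAM per DDT entry, so the histogram gives you a feel for the memory cost. The pool and dataset names here are placeholders:

    # Simulate dedup on an existing pool and print the projected DDT
    # histogram (read-only; can take a while on large pools).
    zdb -S tank1

    # Once dedup is in use, show the actual dedup table statistics.
    zpool status -D tank1

    # Enable dedup only on the dataset(s) that actually benefit from it.
    zfs set dedup=on tank1/backups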
We let our SANs manage the RAID. Why spend money on all that battery-backed NVRAM and those dedicated processors, and then offload the work onto the server, whose CPUs I want doing something other than RAID checksums?
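In that kind of setup the SAN presents already-redundant LUNs, and the pool is built from them as plain striped vdevs (non-redundant from ZFS's point of view). A minimal sketch, with hypothetical device paths:

    # Each LUN is already RAID-protected by the SAN controller, so the
    # pool uses them as simple vdevs with no raidz/mirror on top.
    zpool create tank \
        /dev/dsk/c2t600A0B800029E5D2d0 \
        /dev/dsk/c2t600A0B800029E5D3d0

    # ZFS still checksums every block, so it can detect corruption the
    # array misses, but without ZFS-level redundancy it cannot repair it;
    # copies=2 on selected datasets is one way to get some self-healing back.
    zfs set copies=2 tank/important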