Why would one use RAID 1 and LVM for a file server?
I'm a part-time systems administrator and there is little formal documentation. Our main file server was set up in 2010 by somebody who is not even at the institute any more. The requirement at the time was to have three shared filesystems: /home with a per-user quota and backups, /usr/remote for large software packages, and /scratch with no quota and no backup. Digging through the disks I found an LVM setup and a RAID controller with proprietary software.
The server apparently has 14 disks with 1 TB capacity each, linked up in pairs using RAID 1. The RAID controller (Areca Technology Corp. ARC-1280/1280ML 24-Port PCI-Express to SATA II RAID Controller) is capable of RAID 5 and 6 (per its specs), but somehow that has not been used. One pair is used as the root partition for the server, /. The other six pairs are combined into a single LVM volume group, and that group holds the three filesystems as logical volumes.
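For illustration, the volume group might have been created roughly like this. This is only a sketch: the device names /dev/sdb through /dev/sdg and the volume sizes are my assumptions, since the controller presumably exposes each RAID 1 pair as a single ~1 TB block device:

    # Each RAID 1 pair appears to the OS as one ~1 TB device
    pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
    vgcreate vg0 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
    lvcreate -L 2T -n home vg0            # /home: per-user quota, backed up
    lvcreate -L 1T -n usr_remote vg0      # /usr/remote: software packages
    lvcreate -l 100%FREE -n scratch vg0   # /scratch: no quota, no backup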
I asked about the reasoning, and one of the senior people could only tell me that the person who set it up supposedly knew a fair share about LVM and had really thought this through. But I fail to see the advantages of this setup; I only see disadvantages:
We have 12 disks of 1 TB each, i.e. 12 TB of raw storage, but with the pairing we end up with only 6 TB of actually usable space. Using RAID 5 across all of these disks we would have gotten 11 TB of storage; with RAID 6 it would have been 10 TB. Even if we split them into two sets of six disks, each in RAID 6, that would still be 8 TB of usable space (see the quick check after this list).
The worst-case disk failure is losing a whole pair: part of the LVM volume group is lost, and I would think the system would basically be screwed then. The best case is losing one disk of each pair, which would be seven broken disks. So from a redundancy standpoint we are no better than RAID 5, given the worst case.
With RAID 6 we would have two redundancies and still could use much more space compared to the current setup. So what is the crucial advantage of this that led people to set it up in this way?
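Here is a quick sanity check of those capacity numbers (units are TB; a rough sketch that ignores filesystem and metadata overhead):

    # 12 data disks of 1 TB each; usable capacity per layout:
    echo $((12 / 2))        # six RAID 1 pairs:        6 TB
    echo $((12 - 1))        # one 12-disk RAID 5:      11 TB
    echo $((12 - 2))        # one 12-disk RAID 6:      10 TB
    echo $((2 * (6 - 2)))   # two 6-disk RAID 6 sets:  8 TB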
It's true that RAID 6 gives you larger capacity with guaranteed two-disk redundancy, but you'll also get much slower performance on random write operations compared to RAID 10 (the best setup in your case, in my opinion). For backups that can be OK; for user shares and other workloads it is questionable. In addition, old HDDs have a higher risk of failing during a RAID 5/6 rebuild.
Premise: I agree that a RAID 6 array would have made a lot of sense. However, with so many relatively big disks, I strongly suggest avoiding RAID 5 due to the high chance of a second drive failure during a rebuild.
However, RAID 6 comes with a significant performance penalty, especially during a rebuild. For this very reason, in performance-critical setups I generally use RAID 10, i.e. a stripe over mirrored pairs (note: RAID 0+1, a mirror over striped pairs, should be avoided due to its lower resilience).
From the specifications you posted above, it seems the Areca controller does not support RAID 10 or other nested RAID modes. If the old sysadmin decided to avoid RAID 6 because of its performance and long rebuild times, concatenating the individual arrays into a bigger volume group was the simpler approach.
That said, it is not the best-performing approach: since the arrays are just concatenated, low-queue-depth streaming performance (i.e. single-process sequential read/write) is bound to that of a single mirrored pair. To avoid that, another software-based RAID 0 layer should be put on top of the mirrored pairs, either via plain md or via a striped LVM setup.
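A rough sketch of both variants, assuming the six mirrored pairs show up as /dev/sdb through /dev/sdg (device names and sizes are hypothetical):

    # Variant 1: md RAID 0 across the mirrored pairs, then LVM on top
    mdadm --create /dev/md0 --level=0 --raid-devices=6 \
          /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
    pvcreate /dev/md0
    vgcreate vg0 /dev/md0
    lvcreate -L 2T -n home vg0

    # Variant 2: each pair as its own PV, striping done per logical volume
    pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
    vgcreate vg0 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
    lvcreate --stripes 6 --stripesize 64 -L 2T -n home vg0

Variant 2 keeps everything inside LVM, at the cost of having to remember the --stripes option for every new logical volume.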
We can only speculate, but this setup seems entirely reasonable to me. The pros I can see include:
Disaster recovery - even though it's a proprietary controller, it is fairly likely a RAID 1 disk can be read on another controller if the main controller fails.
Speed - the specs of the controller say nothing about RAID 1 versus RAID 6 performance, but it is probable that RAID 1 disks will perform significantly faster than the other variants for highly scattered workloads, since there are no parity calculations.
Expandability - it may well be that he started with a smaller array and just added pairs as time went by. This would be significantly easier than resilvering an existing array. (It is a bit strange that all disks are 1 TB in this case, though.)
Higher redundancy and quicker rebuilds in case of certain failures. (Out of curiosity: are the disks in each half of the RAID different models or batches? If so, it could show deliberate thinking about failure modes.)
While I would have put / on the LVM as well, it's not unreasonable to keep it separate to allow for easier setup. Also, RAID 6 was not as common nine years ago, as disks were smaller, so there was less perceived need.
Yeah, that does seem a little odd. Without specifics on the make/model we don't know if this was done due to a limitation of the hardware or something, but if not, I'd be surprised if it was due to anything other than a misunderstanding of the available RAID options or a lack of understanding of how to set them up. Maybe they knew LVM really well and just stuck to what they knew.
I'd usually recommend relying on hardware RAID where it's available, for stability and performance reasons.