RAID-6: better to replace two dead drives at the same time, or one at a time?
We have a 16-drive RAID-6 that has three problem drives. Two are already dead, and the third is giving SMART warnings. (Never mind how it got into such a bad state.)
Obviously we want to replace the dead drives before the one that is still working, but is it better to:
replace one dead drive, let the RAID rebuild, then replace the other, and let it rebuild again; or
replace both drives at once and let it rebuild both in parallel?
To put it another way, will we get back to a state of redundancy faster by reintroducing one drive or two? Does rebuilding two drives in parallel slow the rebuild process?
In case it matters, the controller is a 3ware 9650SE-16ML.
Solution 1:
!!!!! ONE !!!!!
Do one at a time. Seriously, don't even think of doing this ANY other way.
Anything else will test your full system restoration skills.
Solution 2:
Do you have good, recent backups? If not, do you think you can get them in a reasonable time?
I'd honestly be more concerned about tripping the bad drive offline during a rebuild than about anything else. If it's already throwing SMART errors, you're more than halfway there.
My suggestion would be to confirm your backups, then rebuild one drive at a time to try to get back to a state where you can replace the one throwing SMART errors (dead drives first, soft errors last).
If you have no backups, it's a crapshoot: taking a backup may generate enough soft errors for the marginal drive to be marked as failed, and so may attempting a rebuild.
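To make the "dead drives first, soft errors last" ordering concrete, here is a rough Python sketch of how much redundancy the array has at each step. It is only a model: the port names (p3, p7, p11) and state labels are made up for illustration, nothing is read from the 3ware controller, and it assumes every rebuild completes without another drive dropping out.

```python
# Illustrative model only. Drive names and state labels are hypothetical,
# and nothing here talks to the RAID controller.
RAID6_PARITY = 2  # RAID-6 survives the loss of any two member drives


def margin(states):
    """Number of further drive losses the array can absorb right now."""
    missing = sum(1 for s in states.values() if s == "missing")
    return RAID6_PARITY - missing


def replacement_order(states):
    """Dead (missing) drives first, drives with SMART warnings last."""
    rank = {"missing": 0, "smart_warning": 1}
    suspects = [d for d, s in states.items() if s in rank]
    return sorted(suspects, key=lambda d: (rank[states[d]], d))


def plan(states):
    """Replace one drive at a time and record the margin during and after
    each rebuild. Assumes every rebuild completes successfully."""
    states = dict(states)  # work on a copy
    steps = []
    for drive in replacement_order(states):
        states[drive] = "missing"  # the slot holds no valid data while rebuilding
        during = margin(states)
        states[drive] = "ok"       # rebuild finished
        steps.append((drive, during, margin(states)))
    return steps


drives = {f"p{i}": "ok" for i in range(16)}
drives.update({"p3": "missing", "p7": "missing", "p11": "smart_warning"})

for drive, during, after in plan(drives):
    print(f"replace {drive}: margin {during} during rebuild, {after} after")
# replace p3: margin 0 during rebuild, 1 after
# replace p7: margin 1 during rebuild, 2 after
# replace p11: margin 1 during rebuild, 2 after
```

The first rebuild is the only one that runs with zero margin, which is why the marginal drive dropping out during that window is the main risk. Pulling the SMART-warning drive before both dead drives are rebuilt would never leave the array with any margin to spare.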