RAID1: Which disk will be mirrored?
How does a RAID1 system determine which disk to use as the source and which disk to use as the destination when mirroring?
Assume, for instance, the following scenario: a RAID1 array is created with two disks, A and B. A is replaced by disk C, which is added to the array. Files are modified as time goes by. Now B is removed and A is reinserted.
Will the RAID1 system realize that A and C are out of sync, and that C is more up to date than A? If not, is there a safe way to prevent the mirroring process from starting immediately when disk A is inserted?
EDIT: I should clarify that in my scenario A had not failed when it was removed, so, as far as I understand, neither A nor C is "dirty" when the RAID1 system must decide which way to mirror between them. (I also assumed no bitmaps, but I understand this may be relevant.)
What happens is that the two disks are written to in tandem, and read from interchangeably. It is a multi-master system, and from this derives the read performance increase arising from mirroring.
When one volume becomes bad, the array is said to be degraded. You then add a new disk; the RAID controller knows this drive does not contain its part of the volume (it is dirty), and the array begins copying the data it is supposed to store onto it from elsewhere in the array (this process is called rebuilding); for RAID 1, there is only one other place to copy it from. Once rebuilding is done, the new volume is clean, and the system is fault-tolerant and multi-master again (normal).
If the other original disk is removed or fails before rebuilding is done, there will be data loss (if removed but not dead, this is trivially recoverable). However, if it is removed after rebuilding is done and a new disk is added, the array will go through the exact same process of degraded to rebuilding to normal.
This is slightly simplified, though it represents the vast majority of cases where disks fail or are otherwise removed and added. It is also usually possible for parts of a volume to be flagged dirty, for instance.
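The degraded, rebuilding, normal cycle described above can be sketched as a toy model. This is only an illustration under the same simplification: any newly added disk is treated as dirty and is overwritten from the one clean member. The names `ToyMirror` and `ArrayState` are made up for this sketch and do not correspond to any real controller or mdadm interface.

```python
from enum import Enum


class ArrayState(Enum):
    NORMAL = "normal"
    DEGRADED = "degraded"
    REBUILDING = "rebuilding"


class ToyMirror:
    """Toy two-member RAID 1 model (hypothetical, for illustration only)."""

    def __init__(self, blocks):
        # Both members start clean and hold identical data.
        self.members = {"A": list(blocks), "B": list(blocks)}
        self.clean = {"A", "B"}
        self.state = ArrayState.NORMAL

    def write(self, index, value):
        # Writes go to every member currently in the array, in tandem.
        for blocks in self.members.values():
            blocks[index] = value

    def remove(self, name):
        # Losing a member leaves the array degraded.
        del self.members[name]
        self.clean.discard(name)
        if not self.clean:
            raise RuntimeError("no clean member left: data loss")
        self.state = ArrayState.DEGRADED

    def add(self, name):
        # Assumption of this sketch: a newly added disk is always dirty.
        size = len(next(iter(self.members.values())))
        self.members[name] = [None] * size
        self.state = ArrayState.REBUILDING

    def rebuild(self):
        # For RAID 1 there is only one place to copy from: the clean member.
        source = next(iter(self.clean))
        for name in self.members:
            if name not in self.clean:
                self.members[name] = list(self.members[source])
        self.clean = set(self.members)
        self.state = ArrayState.NORMAL
```

In this model the direction of the copy is never in question: the disk that stayed in the array is the source, and the disk that was just added is the destination.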
You are talking about a software mirror using mdadm. You did not say whether you use a bitmap or not. I am assuming you do use one (otherwise a full re-mirror will start from the first block whenever a disk is lost, reconnected, and so on).
If you use a bitmap, it is stored per disk: either as an internal bitmap on the mirror disk itself, or as an external bitmap (if you specified one); again, this is per disk.
Now this is also the answer to your question: in the beginning, all bits are marked "dirty", i.e. they need to be resynced. Each bit represents a block on the physical disk, so the status of these bitmaps matters.
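To make the role of the bitmap concrete, here is a minimal sketch of a bitmap-driven resync, assuming one bit per block as described above. `ToyBitmap` and `resync` are hypothetical names for this illustration; this is not the md driver's actual data structure or code.

```python
class ToyBitmap:
    """One dirty flag per block (toy model, not the real on-disk format)."""

    def __init__(self, n_blocks):
        # In the beginning every bit is dirty: nothing has been synced yet.
        self.dirty = [True] * n_blocks

    def mark_dirty(self, index):
        # Called when a block changes while the mirror is not in sync.
        self.dirty[index] = True

    def dirty_blocks(self):
        return [i for i, is_dirty in enumerate(self.dirty) if is_dirty]


def resync(source, target, bitmap):
    """Copy only the blocks flagged dirty; return how many were copied."""
    copied = 0
    for i in bitmap.dirty_blocks():
        target[i] = source[i]
        bitmap.dirty[i] = False
        copied += 1
    return copied
```

The first resync copies every block, because all bits start out dirty. A later resync after a single changed block copies only that block, which is why a mirror with a bitmap can avoid a full re-mirror.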
You can see the status of these bitmaps with cat /proc/mdstat.
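If you want to check this from a script, the following is a rough sketch that pulls the bitmap line for each array out of the /proc/mdstat text. The `SAMPLE` text is an illustrative example of what the file typically looks like, not output captured from a real system, and `bitmap_lines` is a helper name invented here.

```python
import re

# Illustrative sample only; device names and numbers are made up.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

md1 : active raid1 sdd1[1] sdc1[0]
      488254464 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""


def bitmap_lines(mdstat_text):
    """Map each md device name to its 'bitmap:' status line, if it has one."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            # A new array section starts, e.g. "md0 : active raid1 ...".
            current = header.group(1)
        elif current and line.strip().startswith("bitmap:"):
            result[current] = line.strip()
    return result
```

On a live system you would pass in the contents of /proc/mdstat instead of `SAMPLE`. An array that does not appear in the result has no bitmap line, as with md1 in the sample.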
Be aware that a bitmap is not created by default when you create a mirror. You can add one afterwards with mdadm --grow (for example, mdadm --grow --bitmap=internal on the array device).