Can a failed Btrfs drive in RAID-1 be replaced live?

I am trying to decide on a filesystem and would like to know if it is possible to replace a failed drive in btrfs RAID without downtime.

  1. Suppose I create a new btrfs filesystem using the command

    mkfs.btrfs -d raid1 /dev/sdb /dev/sdc
    
  2. Now suppose one day /dev/sdc fails. There are two possibilities. It can fail gradually, showing S.M.A.R.T. errors; in that case I can add a new device, rebalance, and then remove the old one:

    btrfs device add /dev/sde /mnt
    btrfs filesystem balance /mnt
    btrfs device delete /dev/sdc /mnt

  3. But if it fails suddenly, becoming unreadable, a web search says that in this situation I must first unmount the filesystem, mount it in degraded mode, add a new device, and then remove the missing one:

    umount /mnt
    mount -o degraded /dev/sdb /mnt
    btrfs device add /dev/sdf /mnt 
    btrfs device delete missing /mnt
    

An unmount is obviously a disruptive operation, so there would be downtime: any application using the filesystem would get an I/O error. But these kinds of "tutorials" on btrfs look outdated, considering that btrfs is under heavy development.

The question is: considering the current state of btrfs, is it possible to do this online, i.e. without unmounting?

If not, is there a software-only solution that can fulfill this need?


In Linux 3.8, btrfs replace mountpoint old_disk new_disk was added. If you're running a recent kernel, it will provide the functionality you are looking for.


A small correction: the current syntax is

btrfs replace start OLDDEV NEWDEV MOUNTPOINT

which then runs in the background.
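
For example (just a sketch using the device names from the question, so adjust them to your system), replacing the failed /dev/sdc with a new /dev/sdf while /mnt stays mounted would look like this:

    # start a live replacement of the failed disk; the copy runs in the background
    btrfs replace start /dev/sdc /dev/sdf /mnt

    # if /dev/sdc has disappeared entirely, pass its devid instead of the path
    # (here assumed to be 2; check with: btrfs filesystem show /mnt);
    # -r reads from the remaining mirror instead of the failing source where possible
    btrfs replace start -r 2 /dev/sdf /mnt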

You can check the status with

btrfs replace status MOUNTPOINT

which will show you a continuously updated status of the replace operation.
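
If you only need a one-off report (e.g. from a script), status also accepts -1, and a running replacement can be aborted with cancel:

    # print the current progress once instead of updating continuously
    btrfs replace status -1 /mnt

    # abort a running replace operation
    btrfs replace cancel /mnt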