Zpool degrades when plugging in a drive
In an effort to test what impact adding a ZFS log device would have on a ZFS array, I decided to create a zpool and run some benchmarks before plugging in an SSD to act as the ZIL.
Unfortunately, whenever I plug in or unplug the SSD after having created the zpool (anything that causes the device names to change after the pool has been created) and then reboot, the pool becomes degraded, as shown by running sudo zpool status:
pool: zpool1
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
scan: none requested
config:
NAME                     STATE     READ WRITE CKSUM
zpool1                   DEGRADED     0     0     0
  mirror-0               DEGRADED     0     0     0
    sda                  ONLINE       0     0     0
    1875547483567261808  UNAVAIL      0     0     0  was /dev/sdc1
I suspect the problem stems from the fact that I created the pool using the device names, like so:
sudo zpool create -f zpool1 mirror /dev/sdb /dev/sdc
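For context, the /dev/sdX names are assigned at boot and are not guaranteed to stay attached to the same physical disks once drives are added or removed. A quick way to see which disk is currently behind each name (using standard lsblk output columns) is:
lsblk -o NAME,MODEL,SERIAL,SIZE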
Questions
Luckily for me, this is just a test with no risk of losing data, but should this happen in a real-world scenario, what is the best way to recover from this issue? Obviously the drive still exists and is ready to go.
Is there a better way to create the zpool without using device names like /dev/sda, so as to avoid this problem in the future? I notice that the Ubuntu documentation creates a zpool in the same manner that I did.
Extra Info
- OS: Ubuntu Server 16.04, kernel 4.10
- ZFS installed via the zfsutils-linux package
Solution 1:
After getting help from Dexter_Kane on the Level 1 Techs forum, the answer is to use /dev/disk/by-id/... paths instead when creating the pools.
E.g.
sudo zpool create zpool1 mirror \
/dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N0PKS6S7 \
/dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N7VXZF6H
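If you're not sure which by-id path corresponds to which drive, listing the directory shows each persistent ID as a symlink pointing back to the current /dev/sdX name:
ls -l /dev/disk/by-id/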
Converting and Fixing Existing Pools
The good news is that you can "convert" an existing ZFS RAID array to use these labels, which prevents this from happening in the future, and it will even repair your degraded array if this situation has already happened to you.
sudo zpool export [pool name]
sudo zpool import -d /dev/disk/by-id [pool name]
You just have to make sure the pool's datasets are not in use: e.g. don't execute the commands whilst inside the pool, and ensure the datasets aren't being shared via NFS, etc.
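As a rough pre-flight check (the pool name zpool1 and its default mountpoint /zpool1 are just the values from this example), something like the following can confirm nothing is using the datasets before the export:
# See where the pool's datasets are mounted
zfs list -o name,mountpoint,mounted
# Look for open files under the mountpoint
sudo lsof +D /zpool1
# If the datasets are shared over NFS, stop the server first
sudo systemctl stop nfs-kernel-server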
After performing the conversion, the output of sudo zpool status should be similar to:
pool: zpool1
state: ONLINE
scan: none requested
config:
NAME                                          STATE     READ WRITE CKSUM
zpool1                                        ONLINE       0     0     0
  mirror-0                                    ONLINE       0     0     0
    ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N0PKS6S7  ONLINE       0     0     0
    ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N7VXZF6H  ONLINE       0     0     0
Testing Performed
I made sure to test that:
- Using by-id paths did prevent the issue from happening.
- After writing some data whilst the pool was in a degraded state, I could still read all the files after performing the export/import, and sudo zpool status reported no errors (see the sketch below).
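A minimal sketch of that read-back check, assuming the test data lives under the pool's default mountpoint /zpool1: checksum the files while the pool is degraded, do the export/import, then verify.
# Record checksums while the pool is still degraded
find /zpool1 -type f -exec sha256sum {} + > /tmp/before.sha256
# ... export and re-import with -d /dev/disk/by-id ...
# Confirm every file still reads back with the same contents
sha256sum -c /tmp/before.sha256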