Incorrect cache device after ZFS import
I recently migrated from an Ubuntu machine to an Arch Linux machine. I imported the pool using zpool import -f tank and it reported my cache drive as faulted, but my storage drives are working fine. It's a raidz2 with 5 drives. The weird thing is that it's reporting the wrong drive as the cache: it lists sde when it should be sdg. Notice that sde is also listed as a storage device.
❯ zpool status
  pool: tank
 state: ONLINE
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub canceled on Tue Dec 29 21:16:30 2020
config:
        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
            sdf     ONLINE       0     0     0
        cache
          sde       FAULTED      0     0     0  corrupted data
My actual cache drive is happily waiting to be used; see /dev/sdg in the lsblk output:
❯ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.8T 0 disk
|-sda1 8:1 0 1.8T 0 part
`-sda9 8:9 0 8M 0 part
sdb 8:16 0 1.8T 0 disk
|-sdb1 8:17 0 1.8T 0 part
`-sdb9 8:25 0 8M 0 part
sdc 8:32 0 447.1G 0 disk
|-sdc1 8:33 0 450M 0 part
|-sdc2 8:34 0 100M 0 part
|-sdc3 8:35 0 16M 0 part
|-sdc4 8:36 0 445.7G 0 part
`-sdc5 8:37 0 875M 0 part
sdd 8:48 0 1.8T 0 disk
|-sdd1 8:49 0 1.8T 0 part
`-sdd9 8:57 0 8M 0 part
sde 8:64 0 1.8T 0 disk
|-sde1 8:65 0 1.8T 0 part
`-sde9 8:73 0 8M 0 part
sdf 8:80 0 1.8T 0 disk
|-sdf1 8:81 0 1.8T 0 part
`-sdf9 8:89 0 8M 0 part
sdg 8:96 0 465.8G 0 disk
|-sdg1 8:97 0 465.8G 0 part
`-sdg9 8:105 0 8M 0 part
sr0 11:0 1 1024M 0 rom
nvme0n1 259:0 0 1.8T 0 disk
|-nvme0n1p1 259:1 0 550M 0 part /boot/EFI
`-nvme0n1p2 259:2 0 1.8T 0 part /
I'm not sure how to replace the cache drive with the correct one. The replace command throws an error:
sudo zpool replace tank sde
/dev/sde is in use and contains a unknown filesystem.
I tried adding the actual cache drive back and got this error:
❯ sudo zpool add tank cache /dev/sdg
cannot add to 'tank': one or more vdevs refer to the same device
The zdb output doesn't list the cache device:
tank:
version: 5000
name: 'tank'
state: 0
txg: 3783078
pool_guid: 3128882764625212484
errata: 0
hostname: 'stephen-desktop'
com.delphix:has_per_vdev_zaps
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 3128882764625212484
create_txg: 4
children[0]:
type: 'raidz'
id: 0
guid: 12617640708297166488
nparity: 2
metaslab_array: 256
metaslab_shift: 34
ashift: 12
asize: 10001923440640
is_log: 0
create_txg: 4
com.delphix:vdev_zap_top: 129
children[0]:
type: 'disk'
id: 0
guid: 13646832995608515279
path: '/dev/sde1'
whole_disk: 1
DTL: 344
create_txg: 4
com.delphix:vdev_zap_leaf: 130
children[1]:
type: 'disk'
id: 1
guid: 437662985516969209
path: '/dev/sdb1'
whole_disk: 1
DTL: 343
create_txg: 4
com.delphix:vdev_zap_leaf: 131
children[2]:
type: 'disk'
id: 2
guid: 12577615618022029516
path: '/dev/sda1'
whole_disk: 1
DTL: 368
create_txg: 4
com.delphix:vdev_zap_leaf: 367
children[3]:
type: 'disk'
id: 3
guid: 14049339035002966003
path: '/dev/sdd1'
whole_disk: 1
DTL: 341
create_txg: 4
com.delphix:vdev_zap_leaf: 133
children[4]:
type: 'disk'
id: 4
guid: 2563007804694134101
path: '/dev/sdf1'
whole_disk: 1
DTL: 340
create_txg: 4
com.delphix:vdev_zap_leaf: 134
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
Solution 1:
You could consider importing the pool with /dev/disk/by-id names instead of the standard SCSI sd* names. Because of the OS move and inconsistent device enumeration, /dev/sd* names are not deterministic and can change between boots or systems.
Here's an example of how to do this: https://unix.stackexchange.com/q/288599/3416
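As a rough sketch of the approach above (not tested against this particular pool): the first function shows which persistent by-id names point at a given device node, and the second exports the pool and imports it again while scanning only /dev/disk/by-id. The function names by_id_links and reimport_by_id are made up for illustration; zpool export and zpool import -d are the actual commands doing the work.

```shell
#!/bin/sh
# List the persistent names in a by-id style directory that resolve to a
# given device node. The directory is a parameter so the function can be
# tried against a scratch directory; it defaults to /dev/disk/by-id.
# (by_id_links is a hypothetical helper name, not part of ZFS.)
by_id_links() {
    target=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Export the pool and import it again, scanning only /dev/disk/by-id so the
# vdevs are looked up under persistent names instead of sd* names.
# (reimport_by_id is a hypothetical wrapper; run it as root.)
reimport_by_id() {
    pool=$1
    zpool export "$pool" && zpool import -d /dev/disk/by-id "$pool"
}
```

For example, by_id_links /dev/sdg would show the by-id names of the real cache SSD, and reimport_by_id tank would redo the import. If the stale cache entry still shows as faulted afterwards, removing it with zpool remove and adding the cache back by its by-id path may be the next thing to try, but that part is an assumption on my side.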