zpool fails to import raidz3 pool despite sufficient replicas available

Some users were having issues connecting to this server's share on the pool, while others who were already connected seemed to be fine. After I arranged a reboot, the pool failed to import once the system booted.

During the reboot I noticed that a drive had faulted during POST, indicated by an orange light on the bezel; it also shows up in the zpool import output below.

The pool has enough devices to be brought online, but it won't successfully import.

$ zpool import
   pool: darkpool
     id: 5743344949875332602
  state: DEGRADED
 status: One or more devices contains corrupted data.
 action: The pool can be imported despite missing or damaged devices.  The
    fault tolerance of the pool may be compromised if imported.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
 config:

    darkpool                      DEGRADED
      raidz3-0                    DEGRADED
        wwn-0x5000c5008581aafb    ONLINE
        wwn-0x5000c5008581b61b    ONLINE
        wwn-0x5000c5008581b79f    ONLINE
        wwn-0x5000c5008581b933    ONLINE
        wwn-0x5000c5008581b953    ONLINE
        wwn-0x5000c5008581bdf7    ONLINE
        wwn-0x5000c50085825ec7    ONLINE
        wwn-0x5000c5008581cc03    ONLINE
        wwn-0x5000c5008581e423    UNAVAIL
        wwn-0x5000c5008581fd3f    ONLINE
        wwn-0x5000c50085820b93    ONLINE
        wwn-0x5000c500858211b3    ONLINE
        wwn-0x5000cca267ab0de4    ONLINE
        spare-13                  DEGRADED
          11992420879588183985    FAULTED  corrupted data
          wwn-0x5000c500858252ef  ONLINE
    spares
      wwn-0x5000c500858252ef

$ zpool status
no pools available

$ zpool import darkpool
cannot import 'darkpool': I/O error
    Destroy and re-create the pool from
    a backup source.

$ zpool import -f darkpool
cannot import 'darkpool': I/O error
    Destroy and re-create the pool from
    a backup source.

$ zpool import -fFn darkpool

$ zpool import -F darkpool
cannot import 'darkpool': I/O error
    Destroy and re-create the pool from
    a backup source.

$ zpool import -fFX darkpool
cannot import 'darkpool': I/O error
    Destroy and re-create the pool from
    a backup source.
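
To back up the claim that enough devices are present, the state column of the zpool import listing above can be tallied. This is only a rough sketch: the function name tally_states is made up, and it simply counts every config line whose second column is a device state, so the pool, raidz3 and spare rows are counted along with the disks. raidz3 tolerates up to three failed devices per vdev.

```shell
#!/bin/sh
# Hypothetical helper: tally the state column of a saved `zpool import` listing.
# Header lines such as "state: DEGRADED" are skipped because their first
# field ends in a colon.
tally_states() {
    awk '
        $1 !~ /:$/ && $2 ~ /^(ONLINE|DEGRADED|FAULTED|UNAVAIL|OFFLINE|REMOVED)$/ {
            count[$2]++
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Example use:
#   zpool import > import.txt
#   tally_states < import.txt
```

For the listing above, the failed entries come to one UNAVAIL and one FAULTED device, which is below the three failures raidz3 can absorb.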

Has anyone seen something like this before? I'm not sure what to try before destroying the pool and restoring from a backup (I'd like to avoid this since it will take so long).

It looks like the backups started to fail a couple of weeks ago. Is there any way to know if having the faulted drive serviced would make the pool happy?

The system is Ubuntu 18.04.2 LTS with zfsutils-linux_0.7.5-1ubuntu16.7_amd64.


Solution 1:

I wound up signing up for LinkedIn Premium so I could message a ZFS developer (who was actually kind enough to respond!). He suggested I move the pool to a system running ZFS 0.8, a version that includes his relevant commits (on GitHub) and that ships with Ubuntu 19.10, among other distros.
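
Before retrying the import on the replacement system, it may be worth confirming that the loaded kernel module really is 0.8 or newer. The following is a minimal sketch and not part of the developer's advice: the helper name version_ge is made up, and it assumes the module reports its version in /sys/module/zfs/version.

```shell
#!/bin/sh
# Hypothetical helper: version_ge HAVE WANT
# Succeeds if the major.minor part of HAVE is >= that of WANT.
# Anything after the second component (patch level, package suffix) is ignored.
version_ge() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, /[.-]/)
        split(want, w, /[.-]/)
        # Different major versions decide the result on their own.
        if (h[1] + 0 != w[1] + 0) exit !(h[1] + 0 > w[1] + 0)
        exit !(h[2] + 0 >= w[2] + 0)
    }'
}

# Example use (assumes the zfs module is loaded and exposes this file):
#   version_ge "$(cat /sys/module/zfs/version)" 0.8 && echo "ZFS is new enough"
```

The 0.7.5 package from the question would fail this check, while a 0.8.x module would pass.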

We were able to load the pool in read-only mode by disabling the module parameter spa_load_verify_metadata. Disabling it also skips the scan of the pool during import, so you don't have to wait minutes or hours, depending on the size of your pool.

Once the pool was loaded, I started a backup of everything to a different server, with plans to destroy the pool and the server (too many on-site trips from Dell to replace CPUs, memory, the motherboard, etc.) and start fresh with a new system.


Toggling the Option (Ubuntu 19.10, as root):

$ cat /sys/module/zfs/parameters/spa_load_verify_metadata
1
$ echo 0 >/sys/module/zfs/parameters/spa_load_verify_metadata
$ cat /sys/module/zfs/parameters/spa_load_verify_metadata
0

Loading the Pool:

$ zpool import -f -o readonly=on darkpool

The parameter resets to its default (1) after a reboot, so the pool will again fail to import during the boot process. But really, you want to copy the data off and stop using the pool anyway.
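
The two steps above can be wrapped in one guarded function. This is a sketch under assumptions, not a tested recovery tool: the names import_readonly, PARAM and ZPOOL are made up for illustration, and the function expects to run as root with the zfs module already loaded.

```shell
#!/bin/sh
# Hypothetical wrapper around the toggle and import steps shown above.
# PARAM and ZPOOL can be overridden, e.g. for a dry run with ZPOOL=echo.
PARAM="${PARAM:-/sys/module/zfs/parameters/spa_load_verify_metadata}"
ZPOOL="${ZPOOL:-zpool}"

import_readonly() {
    pool="$1"
    [ -n "$pool" ] || { echo "usage: import_readonly POOL" >&2; return 2; }

    # The parameter file only exists while the zfs module is loaded,
    # and writing to it needs root.
    [ -w "$PARAM" ] || { echo "cannot write $PARAM" >&2; return 1; }

    echo "previous value: $(cat "$PARAM")"
    echo 0 > "$PARAM" || return 1

    # Read-only import, forced, as in the answer above.
    "$ZPOOL" import -f -o readonly=on "$pool"
}

# Example use:
#   import_readonly darkpool
```

Keeping the parameter path and the zpool binary in variables makes it possible to rehearse the sequence against a scratch file before touching the real pool.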