ZFS checksum errors on Solaris 11 under KVM

Synopsis: libvirt 5.6.0, QEMU 4.1.1, Linux kernel 5.5.10-200, Fedora Server 31.

The guest is a fresh Solaris 11.4 install (with Solaris 10 branded zones) on raw disk images stored on XFS (unfortunately, there is no possibility of switching to ZFS on Linux and passing a ZVOL through to the VM). When I copy a large gzipped file to a ZFS dataset in the Solaris VM, the zpool reports checksum errors, and when I gunzip the file, the gunzipped result is corrupted.

At first the Solaris VM was hosted on qcow2 virtual disks. I thought that CoW on CoW was probably a bad idea, so I switched to raw. Nothing really changed.

Any ideas? (I'm actually out of them.) The Solaris 11.4 datasets themselves aren't corrupted. I also successfully run FreeBSD with ZFS on similar setups under KVM (using ZVOLs there, but still on Linux), with no checksum errors.

Pristine pool:

  pool: oracle
 state: ONLINE
  scan: scrub repaired 0 in 28s with 0 errors on Mon Mar 22 09:58:30 2021

config:

        NAME    STATE      READ WRITE CKSUM
        oracle  ONLINE        0     0     0
          c3d0  ONLINE        0     0     0

errors: No known data errors

Copying the file:

[root@s10-zone ~]# cd /opt/oracle/exchange/
[root@s10-zone exchange]# scp [email protected]:/Backup/oracle/expdp/lcomsys.dmp.gz .
Password: 
lcomsys.dmp.gz       100% |*********************************************************************| 27341 MB  2:23:09
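
A quick way to check a copied archive without waiting for a scrub is to stream it through a gzip decoder, which verifies the CRC-32 stored in the gzip trailer. The following is only a sketch in Python, not something from the original setup; the function name `gzip_stream_ok` is made up for illustration, and the demo uses small in-memory data rather than the real dump.

```python
import gzip
import io
import zlib


def gzip_stream_ok(fileobj, chunk_size=1 << 20):
    """Decompress a gzip stream to the end and report whether it is intact.

    The gzip module checks the CRC-32 and length stored in the trailer
    once the end of the stream is reached, so the whole stream is read.
    """
    try:
        with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
            while gz.read(chunk_size):
                pass  # discard the data, only the integrity check matters
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers a failed CRC or bad header, EOFError a truncated
        # file, zlib.error a damaged deflate stream.
        return False


if __name__ == "__main__":
    good = gzip.compress(b"example payload " * 1000)
    damaged = bytearray(good)
    damaged[-8] ^= 0xFF  # flip one byte of the stored CRC-32
    print(gzip_stream_ok(io.BytesIO(good)))            # True
    print(gzip_stream_ok(io.BytesIO(bytes(damaged))))  # False
```

For a file on disk, the same function can be called as `gzip_stream_ok(open(path, "rb"))`. Running it on both the source host and the VM shows whether the archive was damaged in transit or only after landing on the pool.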

I ran a scrub after the copy finished:

  pool: oracle
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://support.oracle.com/msg/ZFS-8000-8A
  scan: scrub repaired 6.50K in 5m16s with 3 errors on Tue Mar 23 09:36:34 2021

config:

        NAME    STATE      READ WRITE CKSUM
        oracle  ONLINE        0     0     3
          c3d0  ONLINE        0     0    10

errors: Permanent errors have been detected in the following files:

        /system/zones/s10-zone/root/opt/oracle/exchange/lcomsys.dmp.gz
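
To watch for this while testing different disk settings, the CKSUM column of `zpool status` can be pulled out with a small script. This is a minimal sketch in Python; the function name `cksum_errors` is hypothetical, and it assumes the counters are printed as plain integers, as in the output above.

```python
def cksum_errors(status_text):
    """Return {name: count} for config rows with a nonzero CKSUM counter."""
    errors = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True  # header of the config table
        elif line.startswith("errors:"):
            in_config = False  # the table ends before the errors summary
        elif in_config and len(fields) == 5 and fields[4].isdigit():
            if int(fields[4]) > 0:
                errors[fields[0]] = int(fields[4])
    return errors


# Config table copied from the scrub output above.
SAMPLE = """\
config:

        NAME    STATE      READ WRITE CKSUM
        oracle  ONLINE        0     0     3
          c3d0  ONLINE        0     0    10

errors: Permanent errors have been detected in the following files:
"""

if __name__ == "__main__":
    print(cksum_errors(SAMPLE))  # {'oracle': 3, 'c3d0': 10}
```

Feeding it the text of `zpool status oracle` after each test copy gives a quick yes/no answer on whether a configuration change helped.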

This is how the Solaris virtual disks are attached:

    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/vms/disks/solaris11.img'/>
      <backingStore/>
      <target dev='sda' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/var/vms/iso/sol-11_4-text-x86.iso'/>
      <backingStore/>
      <target dev='hda' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/vms/disks/solaris10-data.img'/>
      <backingStore/>
      <target dev='hdb' bus='ide'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/vms/disks/solaris11-data.img'/>
      <backingStore/>
      <target dev='hdc' bus='ide'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>

Weird, but since the rpool was not getting corrupted, I changed the definitions of the data disks in the VM from IDE to SATA:

    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/vms/disks/solaris10-data.img'/>
      <backingStore/>
      <target dev='sdb' bus='sata'/>
      <address type='drive' controller='1' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/vms/disks/solaris11-data.img'/>
      <backingStore/>
      <target dev='sdc' bus='sata'/>
      <address type='drive' controller='2' bus='0' target='0' unit='0'/>
    </disk>

And the ZFS checksum corruption magically stopped.
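
To find other guests that still have hard disks on the emulated IDE bus, the domain XML can be scanned for `<target bus='ide'>`. This is a rough sketch in Python using only the standard library; the function name `disks_on_bus` is made up for illustration, and the sample is a trimmed stand-in for what `virsh dumpxml` prints, reusing two of the disk definitions from this post.

```python
import xml.etree.ElementTree as ET


def disks_on_bus(domain_xml, bus):
    """Return (target dev, source file) for every hard disk on the given bus."""
    root = ET.fromstring(domain_xml)
    found = []
    for disk in root.iter("disk"):
        if disk.get("device") != "disk":
            continue  # skip cdrom and other non-disk devices
        target = disk.find("target")
        source = disk.find("source")
        if target is not None and target.get("bus") == bus:
            path = source.get("file") if source is not None else None
            found.append((target.get("dev"), path))
    return found


# Trimmed sample: one SATA disk and one IDE disk from the original config.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/vms/disks/solaris11.img'/>
      <target dev='sda' bus='sata'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/vms/disks/solaris10-data.img'/>
      <target dev='hdb' bus='ide'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(disks_on_bus(SAMPLE_XML, "ide"))
    # [('hdb', '/var/vms/disks/solaris10-data.img')]
```

This only reports which disks sit on which bus. Whether IDE emulation is really the root cause here is not something the script can tell; all I know is that the errors stopped after the switch.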
