Why is `zfs list -t snapshot` orders of magnitude slower than `ls .zfs/snapshot`?

Snapshot operations are a function of the number of snapshots you have, RAM, disk performance and drive space. This would be a general ZFS issue, not something unique to the Linux variant.

The better question is: Why you have 1797 snapshots of a zvol? This is definitely more than recommended and makes me wonder what else is happening on the system.

People say "ZFS snapshots are free", but that's not always true.

While ZFS snaps don't have an impact on production performance, the high number you have clearly require disk accesses to enumerate.

Disk access time > RAM access time, hence the order of magnitude difference.


strace output. Note the time per syscall and imagine how poorly it would scale with the number of snapshots in your filesystem.

# strace -c ls /ppro/.zfs/snapshot

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0        10           read
  0.00    0.000000           0        17           write
  0.00    0.000000           0        12           open
  0.00    0.000000           0        14           close
  0.00    0.000000           0         1           stat
  0.00    0.000000           0        12           fstat
  0.00    0.000000           0        28           mmap
  0.00    0.000000           0        16           mprotect
  0.00    0.000000           0         3           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         2           ioctl
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           fcntl
  0.00    0.000000           0         2           getdents
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         1           statfs
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                   133         2 total

versus

# strace -c zfs list -t snapshot

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.003637          60        61         7 ioctl
  0.00    0.000000           0        12           read
  0.00    0.000000           0        50           write
  0.00    0.000000           0        19           open
  0.00    0.000000           0        19           close
  0.00    0.000000           0        15           fstat
  0.00    0.000000           0        37           mmap
  0.00    0.000000           0        19           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         4           brk
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         3         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.003637                   250         9 total

zfs list -t snapshot always takes many orders of magnitude more time to run than ls .zfs/snapshot

You're also comparing two completely different operations.

zfs list -t snapshot enumerates all the ZFS snapshots on the system - and provides a lot of information about those snapshots, such as the amount of space used. Run that under strace to see the system calls made.

ls .zfs/snapshot is just emitting a simple name list from a directory. There's nothing to do other than read the names - and provide nothing else.