Why is `zfs list -t snapshot` orders of magnitude slower than `ls .zfs/snapshot`?
Snapshot operations are a function of the number of snapshots you have, RAM, disk performance and drive space. This would be a general ZFS issue, not something unique to the Linux variant.
The better question is: why do you have 1797 snapshots of a zvol? This is definitely more than recommended and makes me wonder what else is happening on the system.
People say "ZFS snapshots are free", but that's not always true.
While ZFS snapshots don't have an impact on production performance, the high number you have clearly requires disk accesses to enumerate. Disk access time > RAM access time, hence the order of magnitude difference.
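To see where a high snapshot count comes from, it can help to tally snapshots per dataset. The following is a minimal sketch, not a definitive tool: `snaps_per_dataset` is a hypothetical helper name, and it assumes its input is a plain list of snapshot names in the usual `dataset@snapshot` form, one per line.

```shell
#!/bin/sh
# snaps_per_dataset (hypothetical helper): read snapshot names of the form
# dataset@snapshot on stdin, one per line, and print "<count> <dataset>"
# with the largest count first.
snaps_per_dataset() {
    awk -F@ 'NF == 2 { n[$1]++ } END { for (d in n) print n[d], d }' | sort -rn
}

# Possible usage on a live system (-H drops the header, -o name prints names only):
#   zfs list -H -t snapshot -o name | snaps_per_dataset
```

A dataset or zvol that dominates this tally is the first place to look at the snapshot retention policy.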
Here is the `strace` output. Note the time per syscall and imagine how poorly it would scale with the number of snapshots in your filesystem.
# strace -c ls /ppro/.zfs/snapshot
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0        10           read
  0.00    0.000000           0        17           write
  0.00    0.000000           0        12           open
  0.00    0.000000           0        14           close
  0.00    0.000000           0         1           stat
  0.00    0.000000           0        12           fstat
  0.00    0.000000           0        28           mmap
  0.00    0.000000           0        16           mprotect
  0.00    0.000000           0         3           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         2           ioctl
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           fcntl
  0.00    0.000000           0         2           getdents
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         1           statfs
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                   133         2 total
versus
# strace -c zfs list -t snapshot
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.003637          60        61         7 ioctl
  0.00    0.000000           0        12           read
  0.00    0.000000           0        50           write
  0.00    0.000000           0        19           open
  0.00    0.000000           0        19           close
  0.00    0.000000           0        15           fstat
  0.00    0.000000           0        37           mmap
  0.00    0.000000           0        19           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         4           brk
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         3         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.003637                   250         9 total
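The row that differs most between the two summaries is `ioctl`: 2 calls for `ls` against 61 for `zfs list`, and it accounts for all of the measured time. A small sketch for pulling one syscall's call count out of a saved summary is shown here; `syscall_calls` is a hypothetical helper name, and it assumes the column order shown above (% time, seconds, usecs/call, calls, optional errors, syscall).

```shell
#!/bin/sh
# syscall_calls NAME (hypothetical helper): read an `strace -c` summary on
# stdin and print the "calls" value for syscall NAME, or 0 if it is absent.
# Assumes columns: % time, seconds, usecs/call, calls, [errors], syscall.
syscall_calls() {
    awk -v name="$1" '
        # Data rows start with a numeric percentage; the syscall name is the last field.
        $1 ~ /^[0-9.]+$/ && $NF == name { calls = $4 }
        END { print calls + 0 }
    '
}

# Demo on two rows taken from the zfs list summary above.
syscall_calls ioctl <<'EOF'
100.00 0.003637 60 61 7 ioctl
0.00 0.000000 0 12 read
EOF
# prints 61
```

On a live system the summaries can be saved with `strace -c -o FILE COMMAND` and fed to the function, which makes it easy to watch the `ioctl` count as the number of snapshots changes.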
`zfs list -t snapshot` always takes many orders of magnitude more time to run than `ls .zfs/snapshot`
You're also comparing two completely different operations.
`zfs list -t snapshot` enumerates all the ZFS snapshots on the system and provides a lot of information about those snapshots, such as the amount of space used. Run that under `strace` to see the system calls made.
`ls .zfs/snapshot` just emits a simple list of names from a directory. It has nothing to do other than read the names, and it provides nothing else.
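One way to check this on your own system is to time the two commands side by side. The sketch below is only an illustration: `elapsed_ms` is a hypothetical helper, and it assumes GNU `date` (for the `%N` nanosecond format). The third usage line relies on my understanding that OpenZFS can skip per-snapshot property lookups when only names are requested and sorted by name, so treat that as something to measure rather than a guarantee.

```shell
#!/bin/sh
# elapsed_ms CMD [ARGS...] (hypothetical helper): run a command, discard its
# output, and print the wall-clock time it took in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
elapsed_ms() {
    start=$(date +%s%N)
    "$@" > /dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Possible usage on a system with ZFS (the path is the one from the strace runs above):
#   elapsed_ms ls /ppro/.zfs/snapshot
#   elapsed_ms zfs list -t snapshot
#   elapsed_ms zfs list -t snapshot -o name -s name   # names only, sorted by name
```

If the names-only listing comes in much closer to `ls`, that supports the point that the extra time goes into gathering per-snapshot information rather than into reading the names.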