Does btrfs have an efficient way to compare snapshots?

While diffing mounted snapshots would work, it sounds like it could be horribly slow in many cases.

Is there btrfs specific functionality for diffing snapshots? (I was unable to find any in the docs)


Solution 1:

btrfs send, which appeared in Linux 3.6 (2012), "generates a stream of changes between two subvolume snapshots." You can use it just to produce a fast metadata comparison by adding the --no-data flag.

btrfs send --no-data -p /snapshots/parent /snapshots/child

Normally, you would drop the --no-data flag and pipe the output into btrfs receive, to do incremental backups. For example, if /snapshots/parent already exists at /backup/snapshots/parent, btrfs send would stream only those changes to the /backup filesystem:

btrfs send -p /snapshots/parent /snapshots/child | btrfs receive /backup/snapshots

Solution 2:

I was running Debian stable, which did not have btrfs send at the time, so I looked for a solution using btrfs subvolume find-new.

Update: btrfs send was added in Linux 3.6, which was released in 2012 and included in Debian stable by 2015.

If you have snapshot1 and snapshot2 and want to know what changed in the later one (snapshot2) since snapshot1 was made, you can use the script below, which provides

btrfs-diff oldsnapshot/ newsnapshot/

which will list all files changed in newsnapshot/ since oldsnapshot/.

#!/bin/bash
# Print an error and the usage message to stderr, then exit.
usage() { echo "$@" >&2; echo "Usage: $0 <older-snapshot> <newer-snapshot>" >&2; exit 1; }

[ $# -eq 2 ] || usage "Incorrect invocation"
SNAPSHOT_OLD=$1
SNAPSHOT_NEW=$2

[ -d "$SNAPSHOT_OLD" ] || usage "$SNAPSHOT_OLD does not exist"
[ -d "$SNAPSHOT_NEW" ] || usage "$SNAPSHOT_NEW does not exist"

# A generation far in the future matches no files, so only "transid marker was N" is printed.
OLD_TRANSID=$(btrfs subvolume find-new "$SNAPSHOT_OLD" 9999999)
OLD_TRANSID=${OLD_TRANSID#transid marker was }
[ -n "$OLD_TRANSID" ] && [ "$OLD_TRANSID" -gt 0 ] || usage "Failed to find generation for $SNAPSHOT_OLD"

# Drop the trailing marker line, keep only the path field, and de-duplicate.
btrfs subvolume find-new "$SNAPSHOT_NEW" "$OLD_TRANSID" | sed '$d' | cut -f17- -d' ' | sort -u

To explain: btrfs subvolume find-new finds files changed after a particular 'generation' of snapshot. It also reports the current generation number.
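
The generation lookup can be isolated into a small helper. This is only a sketch: parse_transid is a hypothetical name, and it assumes the marker line has exactly the form "transid marker was N" that the script above strips.

```shell
# Hypothetical helper: extract the generation number from a
# "transid marker was N" line (the format the script above relies on).
parse_transid() {
    id=${1#transid marker was }
    # Reject empty strings and anything that is not purely digits.
    case $id in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$id"
}
```

It would be called as parse_transid "$(btrfs subvolume find-new snap1/ 9999999)", and it fails with a non-zero status if the output is not in the expected form.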

Caveats

For example, take the case of daily snapshots of a subvolume:

mkdir test && cd test
btrfs subvolume create live
date >live/foo1
date >live/bar1
btrfs subvolume snapshot live/ snap1
date >live/foo2  # new file
date >>live/bar1 # modify file
rm live/foo1     # delete file
btrfs subvolume snapshot live/ snap2
date >live/foo3  # new file
mv live/bar{1,2} # rename file
rm live/foo2     # delete file

What changed between snap1 and snap2?

$ btrfs-diff snap1/ snap2/
bar1
foo2

So we can see the new file, see the modified file, but the deletion is not reported. This is because the command reports on files that exist, not ones that now don't.

What changed between snap2 and the live subvolume?

$ btrfs-diff snap2/ live/
foo3

The renamed file is not reported, because its data has not changed.

Now what if we add data to the renamed file?

$ date >>live/bar2
$ btrfs-diff snap2/ live/
bar2
foo3

OK, makes sense. But let's make a new file

$ date >live/lala
$ btrfs-diff snap2/ live/
bar2
foo3

Eh! Where's lala? If you add another file, lala appears. So this behaviour is a bit odd, which is probably why the wiki says:

The find-new approach has some serious limitations and thus is not really usable for something like send/receive.

However, the oddness comes when you compare a live subvolume against a previous state, not when you're comparing (read-only) snapshots. So this could still be useful unless you want to also identify deleted files.
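
If deletions matter, one workaround is to compare plain file listings of the two snapshots. This is a sketch rather than anything btrfs-specific: deleted_paths is a hypothetical name, and it walks both trees with find, so it has the cost the question was trying to avoid. It also assumes file names contain no newlines.

```shell
# Hypothetical helper: print paths that exist in the older tree
# but not in the newer one (deleted or renamed away).
deleted_paths() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    # List every path relative to the snapshot root, sorted in the C locale.
    (cd "$1" && find . | LC_ALL=C sort) > "$old_list"
    (cd "$2" && find . | LC_ALL=C sort) > "$new_list"
    # comm -23 keeps only the lines unique to the first (older) listing.
    LC_ALL=C comm -23 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

For the example above this would be run as deleted_paths snap1/ snap2/. Note that a renamed file shows up here under its old name.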

Solution 3:

This is supported by the snapshot convenience tool snapper.

sudo snapper -c config diff 445..446

Of course this requires you to be using snapper for your snapshots.

The snapshot IDs can be found using snapper list -a. Unfortunately, at the time of writing, snapper did not support listing snapshots for a single config, though the numbers can also be found from the subvolume names.

Solution 4:

Current solution:

btrfs send --no-data -p SNAPSHOT_OLD SNAPSHOT_NEW | btrfs receive --dump | grep ^update_extent
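
To reduce that output to a list of changed paths, a small filter can replace the grep. This is a sketch under two assumptions: that each dump line starts with the operation name followed by the path as the second whitespace-separated field, and that paths contain no whitespace. changed_paths is a hypothetical name.

```shell
# Hypothetical filter: read `btrfs receive --dump` output on stdin and
# print each path that has at least one update_extent record, once.
changed_paths() {
    awk '$1 == "update_extent" { print $2 }' | sort -u
}
```

Usage would be: btrfs send --no-data -p SNAPSHOT_OLD SNAPSHOT_NEW | btrfs receive --dump | changed_paths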