Removing files takes too long
Short version: rm -rf mydir, with mydir (recursively) containing 2.5 million files, takes about 12 hours on a mostly idle machine.
More information: Most of the files being deleted are hard links to files in other directories (the directory being deleted is actually the oldest backup made by rsnapshot; the rm command is actually given by rsnapshot). So it's mostly directory entries being deleted - the file content itself isn't much; it's on the order of some tens of GB.
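As a hypothetical sanity check (mydir stands in for the backup directory), one way to confirm that most entries really are hard links is to count the regular files whose link count is above 1:
find mydir -type f -links +1 | wc -l    # files that are hard-linked elsewhere
find mydir -type f | wc -l              # total regular files, for comparison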
I'm far from certain that btrfs is the culprit. I recall that backups were also very slow before I started to use btrfs, but I'm not certain that the slowness was in the deletion.
The machine is an Intel Core i5 2.67 GHz with 4 GB RAM. It has two SATA disks: one has the OS and some other stuff, and the backup disk is a 1 TB WDC WD1002FAEX-00Z3A0. The motherboard is an Asus P7P55D.
Edit: The machine runs Debian wheezy with Linux 3.16.3-2~bpo70+1. This is how the filesystem is mounted:
root@thames:~# mount|grep rsnapshot
/dev/sdb1 on /var/backups/rsnapshot type btrfs (rw,relatime,compress=zlib,space_cache)
Edit: Using rsync -a --delete /some/empty/dir mydir takes about 6 hours. A significant improvement over rm -rf, but still too much, I think. (Explanation of why rsync is faster than rm: "[M]ost filesystems store their directory structures in a btree format, the order [in] which you delete files is ... important. One needs to avoid rebalancing the btree when you perform the unlink.... rsync -a --delete ... does deletions in-order.")
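For the curious, here is a minimal sketch of that empty-directory trick; /tmp/empty and mydir are placeholders, and the trailing slashes matter because they make rsync sync the (empty) contents of the source into mydir:
mkdir -p /tmp/empty
time rsync -a --delete /tmp/empty/ mydir/
rmdir mydir    # mydir itself is left behind, now empty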
Edit: I attached another disk which had 2.2 million files (recursively) in a directory, but on XFS. Here are some comparative results:
                    On the XFS disk    On the BTRFS disk
Cached reads[1]     10 GB/s            10 GB/s
Buffered reads[1]   80 MB/s            115 MB/s
Walk tree[2]        11 minutes         43 minutes
rm -rf mydir[3]     7 minutes          12 hours
[1] With hdparm -T /dev/sdX and hdparm -t /dev/sdX.
[2] Time taken to run find mydir -print|wc -l immediately after boot.
[3] On the XFS disk, this was soon after walking the tree with find. On the BTRFS disk it is the old measurement (and I don't think it was with the tree cached).
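For reference, the measurements above correspond roughly to the following commands (sdX and mydir are placeholders):
hdparm -T /dev/sdX                  # cached reads
hdparm -t /dev/sdX                  # buffered reads
time find mydir -print | wc -l      # walk tree
time rm -rf mydir                   # deletion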
It appears to be a problem with btrfs.
Solution 1:
Well, this is still a Btrfs issue; it is well known that deleting many small files takes quite a long time on Btrfs compared to other file systems.
If you dislike that, you can either wait until upstream has fixed it or move to another file system which handles this better.
Your main error, though, is using an ancient kernel (3.16, which was already ancient when you posted) with Btrfs. Btrfs is a file system which is still under heavy development, so you should always stay with the latest kernel version to pick up the improvements. If your distribution does not do backports, you can either do that yourself or you are screwed.
Btrfs got many performance improvements in kernel version 3.19; that is the minimum version you should use in production. Your kernel version 3.16 plainly sucks without backports.
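On Debian, a rough sketch of checking the running kernel and pulling a newer one from the backports suite follows; whether a sufficiently new kernel image is actually available in your release's backports is an assumption, not a given:
uname -r                                            # show the running kernel version
echo "deb http://ftp.debian.org/debian wheezy-backports main" > /etc/apt/sources.list.d/backports.list
apt-get update
apt-get -t wheezy-backports install linux-image-amd64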
Also keep in mind that Chris Mason considers Btrfs stable by now, but not yet production-ready.
Solution 2:
I'm a bit late to this party, but here's a trick to very quickly delete extremely large btrfs trees:
- Create a dummy subvolume on the same btrfs filesystem.
- Move the top level directory you want to remove into said subvolume - this operation should be really quick if you're doing it on the same btrfs filesystem, even across subvolumes.
- Destroy the subvolume.
The kernel is going to start reclaiming space in the background, so you won't have the available space quite immediately, but the process should be way faster than doing any sort of user-land deletion.
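As a concrete sketch of those steps, with made-up subvolume and backup names under the rsnapshot mount point from the question:
btrfs subvolume create /var/backups/rsnapshot/.trash
mv /var/backups/rsnapshot/oldest-backup /var/backups/rsnapshot/.trash/
btrfs subvolume delete /var/backups/rsnapshot/.trash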