Using rsync to only delete extraneous files
What's the best way of comparing two directory structures and deleting extraneous files and directories in the target location?
I have a small web photo gallery app that I'm developing. Users add & remove images using FTP. The web gallery software I've written creates new thumbnails on the fly, but it doesn't deal with deletions. What I would like to do is schedule a command or bash script to take care of this at predefined intervals.
Original images are stored in /home/gallery/images/ and are organised in albums, using subdirectories. The thumbnails are cached in /home/gallery/thumbs/, using the same directory structure and filenames as the images directory.
I've tried using the following to achieve this:
rsync -r --delete --ignore-existing /home/gallery/images /home/gallery/thumbs
which would work fine if all the thumbnails had already been cached, but there is no guarantee that this is the case. When it isn't, the thumbs directory has original full-size images copied to it.
How can I best achieve what I'm trying to do?
I don't think rsync is the best approach for this. I would use a bash one-liner like the following:
$ cd /home/gallery/thumbs && find . -type f | while IFS= read -r file; do if [ ! -f "../images/$file" ]; then echo "$file"; fi; done
If this one-liner produces the right list of files, you can then modify it to run an rm command instead of an echo command.
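To make the switch from echo to rm less error-prone, the same check could be wrapped in a small function that only lists files by default. This is a rough sketch rather than the code from the answer above: the name prune_thumbs and its --delete argument are made up for illustration, and it also looks at album directories that no longer exist, which the one-liner does not do.

```shell
#!/bin/sh
# prune_thumbs IMAGES_DIR THUMBS_DIR [--delete]
# Hypothetical helper (name and arguments are assumptions, not from the answer).
# Without --delete it only prints what it would remove.
prune_thumbs() (
    images=$(cd "$1" && pwd) || exit 1   # absolute path, because we cd below
    mode=${3:-}
    cd "$2" || exit 1

    # Files under thumbs that have no same-named file under images.
    find . -type f -exec sh -c '
        images=$1; mode=$2; shift 2
        for f in "$@"; do
            [ -f "$images/$f" ] && continue
            if [ "$mode" = "--delete" ]; then rm -f -- "$f"
            else printf "%s\n" "${f#./}"; fi
        done' sh "$images" "$mode" {} +

    # Directories with no counterpart, deepest first so nested albums go too.
    find . -depth -type d -exec sh -c '
        images=$1; mode=$2; shift 2
        for d in "$@"; do
            [ -d "$images/$d" ] && continue
            if [ "$mode" = "--delete" ]; then rmdir -- "$d" 2>/dev/null || true
            else printf "%s/\n" "${d#./}"; fi
        done' sh "$images" "$mode" {} +
)
```

Calling prune_thumbs /home/gallery/images /home/gallery/thumbs prints the stale thumbnails (directories are shown with a trailing slash); adding --delete as a third argument removes them. The function body runs in a subshell, so the cd does not affect the calling shell.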
You need --existing too:
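rsync -r --delete --existing --ignore-existing /home/gallery/images/ /home/gallery/thumbs/
Note the trailing slash on the source: without it, rsync compares against /home/gallery/thumbs/images rather than the contents of /home/gallery/thumbs.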
From the manpage:
--existing, --ignore-non-existing
This tells rsync to skip creating files (including directories) that do not exist yet on the destination. If this option is combined with the --ignore-existing option, no files will be updated (which can be useful if all you want to do is delete extraneous files).
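Since --delete is destructive, it may be worth previewing the result first. The following is a minimal sketch, assuming a wrapper function named sync_deletions with an --apply argument (both invented here for illustration). It relies only on standard rsync options: -n (--dry-run) makes no changes and -v lists the files that would be deleted.

```shell
#!/bin/sh
# sync_deletions SRC DST [--apply]
# Hypothetical wrapper (name and --apply are assumptions): previews by default,
# and only deletes extraneous files in DST when --apply is given.
sync_deletions() (
    src=${1%/}/    # trailing slash: compare the contents of SRC, not SRC itself
    dst=${2%/}/
    if [ "${3:-}" = "--apply" ]; then
        rsync -r --delete --existing --ignore-existing "$src" "$dst"
    else
        # Dry run: nothing is copied or removed, deletions are only listed.
        rsync -rnv --delete --existing --ignore-existing "$src" "$dst"
    fi
)
```

Running sync_deletions /home/gallery/images /home/gallery/thumbs shows what would go; adding --apply performs the deletion. Once the output looks right, the plain rsync command can be scheduled from cron at whatever interval suits the gallery.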