Using rsync to only delete extraneous files

What's the best way of comparing two directory structures and deleting extraneous files and directories in the target location?

I have a small web photo gallery app that I'm developing. Users add and remove images using FTP. The web gallery software I've written creates new thumbnails on the fly, but it doesn't deal with deletions. What I would like to do is schedule a command or bash script to take care of this at predefined intervals.

Original images are stored in /home/gallery/images/ and are organised in albums, using subdirectories. The thumbnails are cached in /home/gallery/thumbs/, using the same directory structure and filenames as the images directory.

I've tried using the following to achieve this:

rsync  -r --delete --ignore-existing /home/gallery/images /home/gallery/thumbs

which would work fine if all the thumbnails had already been cached. There is no guarantee of that, though, and when thumbnails are missing, the original full-size images are copied into the thumbs directory.

How can I best achieve what I'm trying to do?


I don't think rsync is the best approach for this. I would use a bash one-liner like the following:

$ cd /home/gallery/thumbs && find . -type f | while IFS= read -r file; do if [ ! -f "../images/$file" ]; then echo "$file"; fi; done

If this one-liner produces the right list of files, you can then modify it to run an rm command instead of an echo command.
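Once the list looks right, the deleting version could be sketched roughly as below. This is only a sketch under a few assumptions: prune_thumbs is a made-up helper name, both arguments are absolute paths, and thumbnails keep exactly the same relative path as their originals (as described in the question). It also removes album directories that no longer exist under images, which the one-liner above does not cover.

```shell
#!/bin/sh
# prune_thumbs IMAGES THUMBS  (hypothetical helper; pass absolute paths)
# The function body is a subshell, so the cd and the variables stay local.
prune_thumbs() (
    images=$1
    thumbs=$2

    # Refuse to run if either tree is missing (for example an unmounted
    # disk); otherwise every thumbnail would look orphaned.
    [ -d "$images" ] && [ -d "$thumbs" ] || exit 1
    cd "$thumbs" || exit 1

    # Delete thumbnails that have no original at the same relative path.
    find . -type f -exec sh -c '
        images=$1; shift
        for f do
            [ -f "$images/$f" ] || rm -- "$f"
        done
    ' sh "$images" {} +

    # Remove album directories that no longer exist under images.
    # -depth lists children before their parents, so nested albums go first.
    find . -depth -type d ! -name . -exec sh -c '
        images=$1; shift
        for d do
            [ -d "$images/$d" ] || rmdir -- "$d"
        done
    ' sh "$images" {} +
)
```

It would be called as prune_thumbs /home/gallery/images /home/gallery/thumbs, for instance from a cron job. Passing the names through find -exec rather than a read loop means filenames with spaces or newlines are handled as well.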


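You need --existing too. Also put a trailing slash on the source path, so that rsync compares the contents of images with the contents of thumbs instead of looking for thumbs/images:

rsync -r --delete --existing --ignore-existing /home/gallery/images/ /home/gallery/thumbs/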

From the manpage:

  --existing, --ignore-non-existing
          This tells rsync to skip creating files (including  directories)
          that  do  not  exist  yet on the destination.  If this option is
          combined with the --ignore-existing option,  no  files  will  be
          updated  (which  can  be  useful if all you want to do is delete
          extraneous files).
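
Since --delete is destructive, it may be worth previewing the run first. A minimal sketch, assuming a wrapper function called sync_deletions (a name made up for this example) that takes the two directories followed by any extra rsync options:

```shell
#!/bin/sh
# sync_deletions SRC DST [extra rsync options...]  (hypothetical wrapper)
sync_deletions() {
    src=$1
    dst=$2
    shift 2
    # The trailing slash on the source makes rsync compare the contents of
    # the two trees. --existing stops new files from being created and
    # --ignore-existing stops existing ones from being updated, so only
    # the deletions remain.
    rsync -r --delete --existing --ignore-existing "$@" "$src/" "$dst/"
}
```

Calling sync_deletions /home/gallery/images /home/gallery/thumbs --dry-run --verbose lists what would be deleted without touching anything; dropping those two options performs the deletions.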