rsync and include / exclude. How hard can it be?
I'm trying to recursively copy a directory / file structure from one directory to another, keeping only html files. Should be a simple case of include / exclude, shouldn't it?
I just want to print out the files first. When I get that right, I'll copy them.
rsync -a --list-only -v SOURCEDIR --exclude='.*' --include='**/*.html'
Gives me all the files.
rsync -a --list-only -v SOURCEDIR --include='**/*.html' --exclude='*'
and
rsync -a --list-only -v SOURCEDIR --include='*.html' --exclude='*'
rsync -a --list-only -v SOURCEDIR --include=*.html --exclude=*
Give me no files.
rsync -a --list-only -v SOURCEDIR --include='*.html' --exclude='*.*'
Looks like it gives me the whole directory structure and only html files. But I don't want empty directories.
Help!
I'm on Mac OS X 10.6.
Solution 1:
Have you considered using find to do your hard work?
Something along the lines of
find ./ -name "*.html" -exec rsync -R {} /target/base/directory/ \;
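This recreates, under /target/base/directory, only the parts of the ./ directory tree in which html files are found. The -R (--relative) option makes rsync keep each file's path relative to ./ instead of copying just the file name.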
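The command above starts one rsync per file. A possible variant, sketched here on a throwaway tree, feeds the whole list from find to a single rsync through --files-from. The directory and file names are made up for the demonstration, and this is a swapped-in alternative rather than what the answer itself proposes.

```shell
#!/bin/sh
# Scratch tree with hypothetical names, created under the temp directory.
work=$(mktemp -d "${TMPDIR:-/tmp}/rsync-find.XXXXXX")
mkdir -p "$work/src/a/b" "$work/src/empty" "$work/dest"
touch "$work/src/top.html" "$work/src/a/b/leaf.html" "$work/src/a/skip.txt"

# Paths printed by find are relative to the source directory, which is
# what --files-from expects. -print0 / --from0 keep odd file names intact.
cd "$work/src" || exit 1
find . -name '*.html' -print0 |
  rsync -a --from0 --files-from=- . "$work/dest/"

# Show what arrived in the destination.
found=$(cd "$work/dest" && find . -type f | sort)
echo "$found"
```

Because only the listed files (and the directories leading to them) are transferred, directories without html files are never created in the destination.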
Solution 2:
Rsync can be confusing about selective copies like this. I use the following to do the task that you're asking for:
rsync -avP \
--filter='+ */' \
--filter='+ **/*.html' \
--filter='- *' \
--prune-empty-dirs \
--delete \
/source/ \
/dest/
Basically, you need to include all directories in the search, because rsync never descends into a directory that is excluded. Then add all *.html files to the list, then exclude all other files.
The --prune-empty-dirs option is handy to use, as it excludes any directory that doesn't end up containing a *.html file.
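To see the rules at work before pointing them at real data, here is a minimal sketch that runs them against a scratch tree. All paths and file names are hypothetical, --delete is left out, and the include rule is written as '+ *.html' (a pattern without a slash is matched against the file name at any depth) instead of '+ **/*.html'.

```shell
#!/bin/sh
# Scratch tree with hypothetical names, created under the temp directory.
work=$(mktemp -d "${TMPDIR:-/tmp}/rsync-filter.XXXXXX")
mkdir -p "$work/src/docs/deep" "$work/src/images" "$work/dest"
touch "$work/src/index.html" "$work/src/docs/deep/page.html"
touch "$work/src/docs/notes.txt" "$work/src/images/logo.png"

# 1. keep every directory so rsync can descend into it
# 2. keep html files at any depth
# 3. drop everything else
# --prune-empty-dirs then removes directories left with nothing to copy.
rsync -a \
  --filter='+ */' \
  --filter='+ *.html' \
  --filter='- *' \
  --prune-empty-dirs \
  "$work/src/" "$work/dest/"

# Show what arrived in the destination.
copied=$(cd "$work/dest" && find . -type f | sort)
echo "$copied"
```

Only the two html files should be listed, and the images directory should be missing from the destination because nothing inside it matched.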