rsync copy over only certain types of files using include option
I use the following bash script to copy only files with a certain extension (in this case *.sh), but it still copies all the files. What's wrong?
from=$1
to=$2
rsync -zarv --include="*.sh" $from $to
Solution 1:
I think --include is used to include a subset of files that are otherwise excluded by --exclude, rather than including only those files.
In other words: read "include" as "don't exclude".
Try instead:
rsync -zarv --include "*/" --exclude="*" --include="*.sh" "$from" "$to"
For rsync version 3.0.6 or higher, the order needs to be modified as follows (see comments):
rsync -zarv --include="*/" --include="*.sh" --exclude="*" "$from" "$to"
Adding the -m flag will avoid creating empty directory structures in the destination. Tested in version 3.1.2.
So if we only want *.sh files, we have to exclude all files with --exclude="*", include all directories with --include="*/", and include all *.sh files with --include="*.sh".
You can find some good examples in the section Include/Exclude Pattern Rules of the man page.
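To see why the position of --exclude="*" matters, here is a small sketch that runs both orderings against a throwaway tree. It assumes a current rsync, where the first matching rule wins; the file names and the variables src, dst_old and dst_new are made up for the demo.

```shell
#!/usr/bin/env bash
# Scratch tree (hypothetical names): two scripts and one text file.
src=$(mktemp -d); dst_old=$(mktemp -d); dst_new=$(mktemp -d)
mkdir -p "$src/sub"
touch "$src/a.sh" "$src/sub/b.sh" "$src/notes.txt"

# Exclude-all placed before the *.sh include: "*" matches a.sh first, so it is skipped.
rsync -a --include="*/" --exclude="*" --include="*.sh" "$src/" "$dst_old/"

# Exclude-all placed last: *.sh hits its include rule before "*" is ever reached.
rsync -a --include="*/" --include="*.sh" --exclude="*" "$src/" "$dst_new/"

echo "exclude before include:"; (cd "$dst_old" && find . -type f | sort)
echo "exclude last:";           (cd "$dst_new" && find . -type f | sort)
```

With the exclude rule last, only a.sh and sub/b.sh should end up in the destination; with the exclude rule in the middle, no regular files should be copied at all.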
Solution 2:
The answer by @chepner will copy all the sub-directories whether they contain files or not. If you need to exclude the sub-directories that don't contain any matching file and still retain the directory structure, use:
rsync -zarv --prune-empty-dirs --include "*/" --include="*.sh" --exclude="*" "$from" "$to"
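A quick way to check the difference is to run the same filters with and without --prune-empty-dirs on a scratch tree. This is just a sketch; the directory names and the variables src, plain, pruned and filters are invented for the demo.

```shell
#!/usr/bin/env bash
# Scratch tree (hypothetical names): docs/ holds no *.sh file.
src=$(mktemp -d); plain=$(mktemp -d); pruned=$(mktemp -d)
mkdir -p "$src/bin" "$src/docs"
touch "$src/bin/run.sh" "$src/docs/readme.txt"

# Same filter rules for both runs, kept in an array so the quoting survives.
filters=( --include="*/" --include="*.sh" --exclude="*" )

rsync -a "${filters[@]}" "$src/" "$plain/"                      # docs/ is created but stays empty
rsync -a --prune-empty-dirs "${filters[@]}" "$src/" "$pruned/"  # docs/ is left out

echo "without --prune-empty-dirs:"; (cd "$plain"  && find . -mindepth 1 | sort)
echo "with --prune-empty-dirs:";    (cd "$pruned" && find . -mindepth 1 | sort)
```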
Solution 3:
Here's the important part from the man page:
As the list of files/directories to transfer is built, rsync checks each name to be transferred against the list of include/exclude patterns in turn, and the first matching pattern is acted on: if it is an exclude pattern, then that file is skipped; if it is an include pattern then that filename is not skipped; if no matching pattern is found, then the filename is not skipped.
To summarize:
- Not matching any pattern means a file will be copied!
- The algorithm quits once any pattern matches
Also, a pattern ending with a slash matches only directories (like find -type d would).
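The first-match-wins rule can be modeled in a few lines of bash. This is only a rough sketch and not rsync's actual matcher: it assumes every name is a single path component, marks directories with a trailing slash, and uses bash glob matching in place of rsync's pattern rules. The function name decide and the rules array are made up for illustration; the "+ " and "- " prefixes stand for include and exclude.

```shell
#!/usr/bin/env bash
# Simplified model of first-match-wins filter evaluation (not rsync's real code).
# Usage: decide NAME RULE...   where each RULE is "+ PATTERN" or "- PATTERN".
decide() {
    local name=$1; shift
    local rule sign pattern
    for rule in "$@"; do
        sign=${rule%% *}      # "+" or "-"
        pattern=${rule#* }    # the glob after the first space
        # An unquoted right-hand side of == is treated as a glob pattern.
        if [[ $name == $pattern ]]; then
            if [[ $sign == "+" ]]; then echo copy; else echo skip; fi
            return            # stop at the first matching rule
        fi
    done
    echo copy                 # no rule matched: the default is not to skip
}

rules=( "+ */" "+ *.sh" "- *" )
decide "scripts/"  "${rules[@]}"   # copy (directory, matches "+ */")
decide "deploy.sh" "${rules[@]}"   # copy (matches "+ *.sh")
decide "notes.txt" "${rules[@]}"   # skip (falls through to "- *")
decide "notes.txt" "+ *.sh"        # copy (nothing matches, default applies)
```

The last call shows the trap from the question: with only an include rule, a file that matches nothing is still copied.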
Let's pull apart this answer from above.
rsync -zarv --prune-empty-dirs --include "*/" --include="*.sh" --exclude="*" "$from" "$to"
- Don't skip any directories
- Don't skip any .sh files
- Skip everything
- (Implicitly, don't skip anything, but the rule above prevents the default rule from ever happening.)
Finally, --prune-empty-dirs keeps the first rule from making empty directories all over the place.
Solution 4:
One more addition: if you need to sync files by extension in one directory only (without recursion), use a construction like this:
rsync -auzv --include './' --include '*.ext' --exclude '*' /source/dir/ /destination/dir/
Pay attention to the dot in the first --include. --no-r does not work in this construction.
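Here is a small sketch for trying this construction on a scratch tree before pointing it at real data. The file names and the variables src and dst are hypothetical, and "ext" stands in for whatever extension you need.

```shell
#!/usr/bin/env bash
# Scratch tree (hypothetical names): one matching file at the top, one nested.
src=$(mktemp -d); dst=$(mktemp -d)
mkdir -p "$src/nested"
touch "$src/top.ext" "$src/other.txt" "$src/nested/deep.ext"

rsync -a --include './' --include '*.ext' --exclude '*' "$src/" "$dst/"

# nested/ matches no include rule, so --exclude '*' keeps rsync from descending into it.
(cd "$dst" && find . -mindepth 1 | sort)
```

Only top.ext should show up in the destination; nested/deep.ext is never reached because its parent directory is excluded.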
EDIT:
Thanks to gbyte.co for the valuable comment!
EDIT:
The -uzv flags are not directly related to this question; I included them because I usually use them.
Solution 5:
I wrote this handy function and put it in my bash scripts or ~/.bash_aliases. Tested syncing locally on Linux with bash and awk installed. It works.
selrsync(){
    # selective rsync to sync only certain filetypes;
    # based on: https://stackoverflow.com/a/11111793/588867
    # Example: selrsync 'tsv,csv' ./source ./target --dry-run
    local types="$1"; shift  # comma-separated list of types; must be the first argument
    local -a includes
    # awk prints one --include=*.<type> option per line, skipping empty entries;
    # mapfile (bash 4+) stores each line as its own array element
    mapfile -t includes < <(printf '%s\n' "$types" | awk -F',' \
        '{ for (i = 1; i <= NF; i++) if (length($i) > 0) print "--include=*." $i }')
    # "$@" holds the remaining arguments (source, target, extra rsync options)
    echo Command: rsync -avz --prune-empty-dirs --include="*/" "${includes[@]}" --exclude="*" "$@"
    rsync -avz --prune-empty-dirs --include="*/" "${includes[@]}" --exclude="*" "$@"
}
Advantages:
Short, handy, and extensible when one wants to add more arguments (e.g. --dry-run).
Example:
selrsync 'tsv,csv' ./source ./target --dry-run