rsync exclude according to .gitignore & .hgignore & svn:ignore like --filter=:C
Rsync includes a nifty option --cvs-exclude to “ignore files in the same way CVS does”, but CVS has been obsolete for years. Is there any way to make it also exclude files which would be ignored by modern version control systems (Git, Mercurial, Subversion)?
For example, I have lots of Maven projects checked out from GitHub. Typically they include a .gitignore listing at least target, the default Maven build directory (which may be present at top level or in submodules). Since the contents of these directories are entirely disposable, and they can be far larger than the source code, I would like to exclude them when using rsync for backups.
Of course I can explicitly pass --exclude=target/, but that will also suppress unrelated directories that just happen to be named target and are not supposed to be ignored.
And I could supply a complete list of absolute paths for all file names and patterns mentioned in any .gitignore, .hgignore, or svn:ignore property on my disk, but this would be a huge list that would have to be produced by some sort of script.
Since rsync has no built-in support for VCS checkouts other than CVS, is there any good trick for feeding it their ignore patterns? Or some kind of callback system whereby a user script can be asked whether a given file/directory should be included or not?
Update: --filter=':- .gitignore' as suggested by LordJavac seems to work as well for Git as --filter=:C does for CVS, at least on the examples I have found, though it is unclear whether the syntax is an exact match. --filter=':- .hgignore' does not work very well for Mercurial; e.g. an .hgignore containing a line like ^target$ (the Mercurial equivalent of Git's /target/) is not recognized by rsync as a regular expression. And nothing seems to work for Subversion, for which you would have to parse .svn/dir-prop-base for a 1.6 or earlier working copy, and throw up your hands in dismay for a 1.7 or later working copy.
Solution 1:
As mentioned by luksan, you can do this with the --filter switch to rsync. I achieved this with --filter=':- .gitignore' (note the space before ".gitignore"), which tells rsync to do a per-directory merge of the .gitignore files it finds and to treat their entries as excludes, approximating Git's rules. You may also want to add your global ignore file, if you have one. To make it easier to use, I created an alias for rsync that includes the filter.
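As a rough sketch of the alias idea, a small shell function can wrap rsync with the filter baked in. The name rsync_gitignore is made up for illustration, and a function is used instead of an alias only so that it also works in non-interactive scripts.

```shell
# Hypothetical wrapper; the name is an assumption, not an rsync feature.
rsync_gitignore() {
    # ':- .gitignore' is a per-directory merge rule: every .gitignore met
    # during the transfer is read and its lines are applied as excludes.
    # rsync's pattern syntax is close to Git's but not guaranteed identical.
    rsync --filter=':- .gitignore' "$@"
}
```

It is then called like plain rsync, e.g. rsync_gitignore -a src/ dest/ (both paths are placeholders).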
Solution 2:
You can use git ls-files to build the list of files excluded by the repository's .gitignore files.
https://git-scm.com/docs/git-ls-files
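Options:
- --exclude-standard: Apply the standard Git exclusions (the .gitignore in each directory, .git/info/exclude, and the user's global ignore file).
- -o: Show other (i.e. untracked) files.
- -i: Show only ignored files.
- --directory: If an entire directory is ignored, show just its path rather than its whole contents.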
The only thing left to exclude explicitly was .git.
rsync -azP --exclude=.git --exclude-from=<(git -C <SRC> ls-files --exclude-standard -oi --directory) <SRC> <DEST>
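One caveat: git ls-files prints paths relative to the repository root, so when they are used directly as exclude patterns they are unanchored and could also match a same-named directory elsewhere in the tree. A minimal sketch of one way around that is to prefix each path with a slash. The helper name git_ignored_excludes is hypothetical, and the sketch assumes the source is given with a trailing slash so that the transfer root coincides with the repository root. It also assumes no path contains rsync wildcard characters or newlines.

```shell
# Hypothetical helper: print one anchored rsync exclude pattern per
# path that Git ignores in the repository at "$1".
git_ignored_excludes() {
    # -o: untracked files, -i: only those matching ignore rules,
    # --directory: collapse a fully ignored directory to a single entry.
    git -C "$1" ls-files --exclude-standard -oi --directory |
        sed 's|^|/|'    # leading slash anchors the pattern at the transfer root
}

# Usage sketch (bash or zsh, because of the <(...) process substitution):
#   rsync -azP --exclude=.git/ \
#       --exclude-from=<(git_ignored_excludes "$SRC") "$SRC"/ "$DEST"/
```

With this, an ignored top-level target/ becomes the pattern /target/ and no longer hides an unrelated src/main/resources/target/ that Git actually tracks.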