How to move some files from one git repo to another (not a clone), preserving history
Our Git repositories started out as parts of a single monster SVN repository where the individual projects each had their own tree like so:
project1/branches
        /tags
        /trunk
project2/branches
        /tags
        /trunk
Obviously, it was pretty easy to move files from one to another with svn mv. But in Git, each project is in its own repository, and today I was asked to move a subdirectory from project2 to project1. I did something like this:
$ git clone project2
$ cd project2
$ git filter-branch --subdirectory-filter deeply/buried/java/source/directory/A -- --all
$ git remote rm origin # so I don't accidentally overwrite the repo ;-)
$ mkdir -p deeply/buried/different/java/source/directory/B
$ for f in *.java; do
> git mv "$f" deeply/buried/different/java/source/directory/B
> done
$ git commit -m "moved files to new subdirectory"
$ cd ..
$
$ git clone project1
$ cd project1
$ git remote add p2 ../project2
$ git fetch p2
$ git branch p2 remotes/p2/master
$ git merge p2 # --allow-unrelated-histories for git 2.9+
$ git remote rm p2
$ git push
But that seems pretty convoluted. Is there a better way to do this sort of thing in general? Or have I adopted the right approach?
Note that this involves merging the history into an existing repository, rather than simply creating a new standalone repository from part of another one (as in an earlier question).
Solution 1:
If your history is sane, you can export the commits as patches and apply them in the new repository:
cd repository
git log --pretty=email --patch-with-stat --reverse --full-index --binary -- path/to/file_or_folder > patch
cd ../another_repository
git am --committer-date-is-author-date < ../repository/patch
Or in one line:
git log --pretty=email --patch-with-stat --reverse -- path/to/file_or_folder | (cd /path/to/new_repository && git am --committer-date-is-author-date)
(Taken from Exherbo’s docs)
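To see the patch approach end to end, here is a minimal sandbox sketch. The repository names (src, dst), the dir/ folder, and the file names are made up for illustration; only the git log and git am invocations come from the commands above.

```shell
#!/bin/sh
# Sandbox sketch: replay the history of dir/ from src into dst via patches.
# All repository, directory, and file names here are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Source repository: two commits touching dir/ and one unrelated commit.
git init -q src
cd src
mkdir dir
echo one > dir/a.txt
git add .
git commit -q -m "add a"
echo other > other.txt
git add .
git commit -q -m "unrelated"
echo two >> dir/a.txt
git add .
git commit -q -m "edit a"

# Export only the commits that touch dir/, oldest first.
git log --pretty=email --patch-with-stat --reverse --full-index --binary \
    -- dir > ../patch
cd ..

# Target repository with some existing history of its own.
git init -q dst
cd dst
echo readme > README
git add .
git commit -q -m "initial"

# Replay the exported commits on top of the target's history.
git am -q --committer-date-is-author-date < ../patch
git log --oneline -- dir
```

The final git log should list only the replayed commits that touched dir/; the unrelated commit never reaches dst because the path filter left it out of the patch file.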
Solution 2:
Having tried various approaches to move a file or folder from one Git repository to another, the only one which seems to work reliably is outlined below.
It involves cloning the repository you want to move the file or folder from, moving that file or folder to the root, rewriting Git history, cloning the target repository and pulling the file or folder with history directly into this target repository.
Stage One
-
Make a copy of repository A as the following steps make major changes to this copy which you should not push!
git clone --branch <branch> --origin origin --progress \
  -v <git repository A url>
# eg. git clone --branch master --origin origin --progress \
#   -v https://username@giturl/scm/projects/myprojects.git
# (assuming myprojects is the repository you want to copy from)
-
cd into it
cd <git repository A directory> # eg. cd /c/Working/GIT/myprojects
-
Delete the link to the original repository to avoid accidentally making any remote changes (eg. by pushing)
git remote rm origin
-
Go through your history and files, removing anything that is not in directory 1. The result is the contents of directory 1 moved to the root of repository A.
git filter-branch --subdirectory-filter <directory> -- --all # eg. git filter-branch --subdirectory-filter subfolder1/subfolder2/FOLDER_TO_KEEP -- --all
-
For single file move only: go through what's left and remove everything except the desired file. (You may need to delete files you don't want with the same name and commit.)
git filter-branch -f --index-filter \
'git ls-files -s | grep $'\t'FILE_TO_KEEP$ |
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info && \
mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE || echo "Nothing to do"' --prune-empty -- --all
# eg. FILE_TO_KEEP = pom.xml to keep only the pom.xml file from FOLDER_TO_KEEP
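A throwaway sandbox is a low-risk way to check what --subdirectory-filter leaves behind before running it on a real clone. In this sketch the repository name repoA and the paths sub/keep and top.txt are hypothetical. FILTER_BRANCH_SQUELCH_WARNING is, as far as I know, only honored by newer Git versions to skip the deprecation notice, and should be harmless elsewhere.

```shell
#!/bin/sh
# Sandbox sketch: what --subdirectory-filter does to a small repository.
# repoA, sub/keep and top.txt are hypothetical names.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation notice, if any

git init -q repoA
cd repoA
mkdir -p sub/keep
echo one > sub/keep/a.txt
echo x > top.txt
git add .
git commit -q -m "first"
echo two >> sub/keep/a.txt
git add .
git commit -q -m "second"
echo y >> top.txt
git add .
git commit -q -m "top only"

# Rewrite history so that sub/keep becomes the repository root.
git filter-branch --subdirectory-filter sub/keep -- --all > /dev/null

# Inspect the result: files now at the root, and the surviving commits.
ls
git log --oneline
```

After the rewrite, a.txt sits at the root, top.txt is gone, and the commit that only touched top.txt no longer appears in the log.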
Stage Two
-
Cleanup step
git reset --hard
-
Cleanup step
git gc --aggressive
-
Cleanup step
git prune
You may want to import these files into a subdirectory of repository B rather than its root:
-
Make that directory
mkdir <base directory> # eg. mkdir FOLDER_TO_KEEP
-
Move files into that directory
git mv * <base directory> # eg. git mv * FOLDER_TO_KEEP
-
Add files to that directory
git add .
-
Commit your changes and we’re ready to merge these files into the new repository
git commit
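One caveat with git mv *: the glob also matches the freshly created target directory, which Git may refuse to move into itself, and by default it skips dotfiles. Here is a sketch that moves every tracked top-level entry instead, assuming entry names contain no newlines or characters Git would quote. FOLDER_TO_KEEP follows the example above; the sample files and repoA are placeholders.

```shell
#!/bin/sh
# Sandbox sketch: move all tracked top-level entries into a subdirectory.
# repoA and the sample files are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for repository A after filtering: content sits at the root.
git init -q repoA
cd repoA
echo one > a.txt
echo cfg > .hidden
mkdir nested
echo n > nested/b.txt
git add .
git commit -q -m "filtered content at the root"

mkdir FOLDER_TO_KEEP
# git ls-tree lists tracked top-level entries only, so the new (untracked)
# directory is not in the list, while dotfiles are.
git ls-tree --name-only HEAD | while IFS= read -r entry; do
    git mv "$entry" FOLDER_TO_KEEP/
done
git commit -q -m "move everything under FOLDER_TO_KEEP"

# Show the resulting layout.
git ls-tree -r --name-only HEAD
```

Because git mv already stages the moves, no separate git add is needed before the commit in this variant.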
Stage Three
-
Make a copy of repository B if you don’t have one already
git clone <git repository B url> # eg. git clone https://username@giturl/scm/projects/FOLDER_TO_KEEP.git
(assuming FOLDER_TO_KEEP is the name of the new repository you are copying to)
-
cd into it
cd <git repository B directory> # eg. cd /c/Working/GIT/FOLDER_TO_KEEP
-
Create a remote connection to repository A as a branch in repository B
git remote add repo-A-branch <git repository A directory> # (repo-A-branch can be anything - it's just an arbitrary name) # eg. git remote add repo-A-branch /c/Working/GIT/myprojects
-
Pull from this branch (containing only the directory you want to move) into repository B.
git pull repo-A-branch master --allow-unrelated-histories
The pull copies both files and history. Note: a pull is a fetch followed by a merge, so you can also run those as two separate steps, but pull does both at once.
-
Finally, you probably want to clean up a bit by removing the remote connection to repository A
git remote rm repo-A-branch
-
Push and you’re all set.
git push
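The Stage Three steps can be rehearsed locally before touching the real repositories. The sketch below uses stand-in repositories (repoA, repoB) and made-up files. It also passes --no-rebase and --no-edit, which are my additions: newer Git versions may refuse to pull divergent histories unless told whether to merge or rebase, and --no-edit avoids opening an editor for the merge message. The push is left out because the sandbox has no upstream.

```shell
#!/bin/sh
# Sandbox sketch: pull a filtered repository A into repository B.
# repoA, repoB and the sample files are hypothetical stand-ins.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the filtered repository A: two commits under FOLDER_TO_KEEP/.
git init -q repoA
cd repoA
mkdir FOLDER_TO_KEEP
echo one > FOLDER_TO_KEEP/a.txt
git add .
git commit -q -m "A: first"
echo two >> FOLDER_TO_KEEP/a.txt
git add .
git commit -q -m "A: second"
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on config
cd ..

# Stand-in for repository B with its own, unrelated history.
git init -q repoB
cd repoB
echo readme > README
git add .
git commit -q -m "B: initial"

# Add A as a remote, pull its branch, then drop the remote again.
git remote add repo-A-branch ../repoA
git pull -q --no-rebase --no-edit --allow-unrelated-histories \
    repo-A-branch "$branch"
git remote rm repo-A-branch

# The commits from A should now show up in B's history for that folder.
git log --oneline -- FOLDER_TO_KEEP
```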
Solution 3:
Yep, hitting on the --subdirectory-filter of filter-branch was key. The fact that you used it essentially proves there's no easier way - you had no choice but to rewrite history, since you wanted to end up with only a (renamed) subset of the files, and this by definition changes the hashes. Since none of the standard commands (e.g. pull) rewrite history, there's no way you could use them to accomplish this.
You could refine the details, of course - some of your cloning and branching wasn't strictly necessary - but the overall approach is good! It's a shame it's complicated, but of course, the point of git isn't to make it easy to rewrite history.
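As one possible trimmed-down variant of the workflow in the question, the extra remote and the local p2 branch can be skipped by fetching straight from the filtered clone's path and merging FETCH_HEAD. This is only a sandbox sketch with stand-in repositories: the directory different/B and the file A.java are hypothetical, and --allow-unrelated-histories needs git 2.9+.

```shell
#!/bin/sh
# Sandbox sketch: merge a filtered clone without adding a remote or branch.
# project1/project2 here are local stand-ins; paths and files are made up.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the already filtered and rearranged project2 clone.
git init -q project2
cd project2
mkdir -p different/B
echo 'class A {}' > different/B/A.java
git add .
git commit -q -m "p2: add A.java"
cd ..

# Stand-in for the project1 clone.
git init -q project1
cd project1
echo readme > README
git add .
git commit -q -m "p1: initial"

# Fetch the other repository's HEAD by path; the result lands in FETCH_HEAD.
git fetch -q ../project2 HEAD
git merge -q --no-edit --allow-unrelated-histories FETCH_HEAD

git log --oneline
```

Nothing needs cleaning up afterwards in this variant, since no remote or extra branch was ever created in project1.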