Fork and synchronize Google Code Subversion repository into GitHub

How can I fork a Google Code Subversion repository that I don't have write access to into a GitHub repository, and keep the fork in sync?

I want to be able to develop my own features in my Git repository, but I also want to synchronise against the Google Code Subversion repository so that I can fetch fixes from the Google Code project.

I know about git-svn and have used it before to push to and pull from a Subversion repository I had full control over. But I don't know how to keep in sync with a Google Code Subversion repository.


Solution 1:

The remote branch from git-svn is pretty much the same as a regular Git remote. So in your local repository you can have your git-svn clone and push changes out to GitHub. Git doesn't care. If you create your git-svn clone and push the exact same changes out to GitHub, you'll have an unofficial mirror of the Google Code repository. The rest is vanilla Git.

git svn clone http://example.googlecode.com/svn -s
git remote add origin git@github.com:example/example.git
git push origin master

Now that you have this, you will occasionally have to pull new Subversion revisions into Git and push them on to GitHub. It'll look something like:

git svn rebase
git push
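If you end up running those two commands often, you could wrap them in a small script. What follows is only a sketch: the names run and sync_mirror and the DRY_RUN variable are made up for illustration, and it assumes master is the branch tracking Subversion trunk and origin is the GitHub remote, as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper, not part of git or git-svn.
# Defaults to a dry run that only prints each command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

sync_mirror() {
    run git checkout master &&   # the branch that tracks Subversion trunk
    run git svn rebase &&        # fetch new revisions and replay master on top
    run git push origin master   # publish the updated mirror to GitHub
}

sync_mirror
```

The && chain stops at the first failing step, so a failed git svn rebase never gets pushed.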

Right after a push, the history in gitk (or whatever viewer you use) would look something like this:

o [master][remotes/trunk][remotes/origin/master]
|
o
|
o

And when you run git svn rebase, you would have this:

o [master][remotes/trunk]
|
o
|
o [remotes/origin/master]
|
o
|
o

So now running git push would push those commits out to GitHub, moving [remotes/origin/master] up to the same commit as [master]. And you'd be back to the scenario in the first ASCII art diagram.

The problem now is, how do you work your changes into the mix? The idea is, you don't ever commit onto the same branch that you are git-svn-rebase-ing and git-pushing. You need a separate branch for your changes. Otherwise, you would end up rebasing your changes on top of the Subversion ones, which could upset anyone who clones your Git repository. Follow me? OK, so you create a branch, let's call it "features". And you make a commit and push it out to GitHub to the features branch. Your gitk would look something like this:

o [features][remotes/origin/features]
|
o
|
o [master][remotes/trunk][remotes/origin/master]
|
o
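The "never commit on the mirror branch" rule can be checked mechanically before you sync. This is a rough sketch rather than anything git-svn provides: mirror_is_clean is a made-up function name, and the demo uses a plain local branch called trunk as a stand-in for the remotes/trunk ref, in a throwaway repository.

```shell
#!/bin/sh
# Hypothetical check: succeeds only if $1 (the mirror branch) has no commits
# that $2 (the upstream ref) lacks.
mirror_is_clean() {
    [ "$(git rev-list --count "$2..$1")" = "0" ]
}

# Demonstration in a throwaway repository; "trunk" is an ordinary branch
# standing in for the remotes/trunk ref that git-svn maintains.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"
echo a > a.txt; git add a.txt; git commit -q -m "upstream-1"
git branch trunk

mirror_is_clean master trunk && echo "master matches trunk: safe to sync"

# Your own work goes on a separate branch, which is expected to be ahead.
git checkout -q -b features
echo b > b.txt; git add b.txt; git commit -q -m "feature-1"
mirror_is_clean features trunk || echo "features carries local commits, as intended"
```

In a real git-svn clone you would pass the actual Subversion ref (remotes/trunk in the diagrams here) as the second argument.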

In the gitk diagram above, your features branch is a couple of commits ahead of the Google Code branch, right? So what happens when you want to incorporate new stuff from Google Code? You'd run git svn rebase on master first and get this:

                           o [features][remotes/origin/features]
[master][remotes/trunk] o  |
                        |  o
                        o /
                        |/
                        o [remotes/origin/master]
                        |
                        o

If you git push master out, [remotes/origin/master] ends up at the same point as master. But your features branch doesn't have the new changes. Your choices now are to merge master into features, or to rebase features onto master. A merge would look like this:

git checkout features
git merge master 

            o [features]
           /|
          / o [remotes/origin/features]
[master] o  |
         |  o
         o /
         |/
         o
         |
         o

Then you push features out to GitHub. I've left off the remotes for master to save space, they'd be at the same point as [master].
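To see the merge without touching a real Subversion repository, you can replay it in a throwaway repository. This is just an illustrative sketch: the commit helper and the file names are invented, and an ordinary commit on master stands in for the revisions git svn rebase would bring in.

```shell
#!/bin/sh
set -e
# Throwaway repository; nothing here talks to Subversion or GitHub.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"

# Hypothetical helper: one commit that adds one file.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit upstream-1
git checkout -q -b features
commit feature-1
commit feature-2
git checkout -q master
commit upstream-2                 # stands in for what `git svn rebase` fetches

git checkout -q features
git merge -q --no-edit master     # creates the merge commit from the diagram

git log --graph --oneline --all
```

The graph printed at the end should have the same shape as the diagram above: two feature commits, one new upstream commit, and a merge commit on top of features.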

The rebase approach is slightly more evil: you'd have to push with --force, as your push would not be a fast-forward (you'd pull the features branch out from under anyone who had cloned it). It's not really considered OK to do this, but nobody can stop you if you are determined. It does make some things easier too, such as when patches get accepted upstream in slightly reworked form. Rather than messing about with conflicts, you can just git rebase --skip the upstreamed patches. Anyway, a rebase would be like this:

git rebase master features

         o [features]
         |
         o
         |  o [remotes/origin/features]
[master] o  |
         |  o
         o /
         |/
         o
         |
         o

And then you would have to git push --force that. You can see why you need to force it, the history has a big old schism from the [remotes/origin/features] to the new current post-rebase [features].
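You can watch the rejected push happen in a sandbox. The sketch below makes some assumptions for the sake of being runnable offline: a local bare repository plays the part of GitHub, an ordinary commit on master plays the part of git svn rebase, and the commit helper is invented.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"   # local stand-in for the GitHub remote
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"
git remote add origin "$work/origin.git"

# Hypothetical helper: one commit that adds one file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit upstream-1
git push -q origin master
git checkout -q -b features
commit feature-1
git push -q origin features
git checkout -q master
commit upstream-2                 # stands in for `git svn rebase`
git push -q origin master

git rebase -q master features     # replay feature-1 on top of the new master

# A plain push is refused: origin/features is no longer an ancestor of features.
if git push -q origin features 2>/dev/null; then
    echo "unexpected: plain push succeeded"
else
    echo "plain push rejected (not a fast-forward)"
fi

git push -q --force origin features
```

If other people might be pushing to the same branch, git push --force-with-lease is a somewhat safer variant: it refuses to overwrite the remote branch if it has moved since you last fetched it.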

This all works, but it is a lot of effort. If you are going to be a regular contributor, the best bet would be to work like this for a while, send some patches upstream and see if you can get commit access to Subversion. Failing that, perhaps don't push your changes out to GitHub. Keep them local and try and get them accepted upstream anyway.

Solution 2:

svn2github service

The website http://svn2github.com/ provides a service to fork any publicly accessible SVN repository onto GitHub (at https://github.com/svn2github/projectname). I tried it; upon pressing "Make a mirror" it apparently did nothing for a few seconds and displayed the message "error", but it actually worked. The new repository was in fact created, containing the code from the SVN repo.

You would then fork the repository it creates, and work on your own fork. You would then submit your changes to the upstream project using their bug tracker.
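Keeping your fork current with the mirror is then ordinary Git. As a sketch only, here is the fetch-and-merge cycle replayed with local directories: the mirror directory stands in for the svn2github repository, the remote name upstream and the branch name features are my own choices, and in real use the remote URL would be the GitHub one.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
# Identity for the throwaway commits below.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

# Local stand-in for https://github.com/svn2github/projectname
git init -q "$work/mirror"
cd "$work/mirror"
git symbolic-ref HEAD refs/heads/master
echo 1 > upstream.txt; git add upstream.txt; git commit -q -m "r1 from SVN"

# Your fork, with the mirror registered as a second remote called "upstream".
git clone -q "$work/mirror" "$work/fork"
cd "$work/fork"
git remote add upstream "$work/mirror"
git checkout -q -b features
echo mine > feature.txt; git add feature.txt; git commit -q -m "my feature"

# Later, the mirror picks up a new Subversion revision.
( cd "$work/mirror" && echo 2 >> upstream.txt && git commit -q -am "r2 from SVN" )

# Bring the new revision into your feature branch.
git fetch -q upstream
git merge -q --no-edit upstream/master
```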

Looking at existing repositories under the service's GitHub user (e.g. "svn2github pushed to master at svn2github/haxe 5 hours ago"), it does seem to regularly pull in changes from the SVN repository. There's no information on the website about who runs the service, so I wouldn't bet on it continuing to run indefinitely, but it works for now (and if it ever goes down, you can still manually update your fork).

Launchpad

If you're not set on using Git and GitHub, another alternative is to use Launchpad.net. Launchpad can automatically import SVN (also CVS) repositories into a personal bzr branch. To do this, create a Launchpad project, then go to the new import page, select Subversion and enter the URL (e.g. http://projectname.googlecode.com/svn/trunk/). Depending on the project size, the initial import can take up to a few hours. Subsequent imports will run periodically.

For more documentation, see VCS Imports on Launchpad Help.