Two Git repositories share common code

I have a project with client parts and server parts. I would like the client and server code to live in different Git repositories for clear separation. However, they share some files, and ideally the shared files will always be identical in both client and server.

How should I go about this? Is it better to have the third repository for the common code?


Solution 1:

I will not give advice on “how you should go about this”, but I'll explain how to integrate a third repository with common code into the other repositories. This is a job for Git submodules, which let you reference a snapshot of another Git repository at a specified path:

git submodule add git://example.com/repository.git path

Afterwards create a commit. Now a snapshot of the given repository is referenced at path.
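To make that concrete, here is a minimal sketch that runs entirely inside a temporary directory. The repository names (common, client), the file shared.txt and the demo identity are made up for illustration, and a local path stands in for the git://example.com URL. The protocol.file.allow override is only there because of that local path: recent Git versions refuse file-based submodule clones by default, and it should not be needed with a real network remote.

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# Identity used for the demo commits only.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# 1. A stand-in for the repository holding the shared code.
git init -q common
echo "shared helper" > common/shared.txt
git -C common add shared.txt
git -C common commit -q -m "Add shared helper"

# 2. The client repository that will reference it.
git init -q client
cd client

# The -c override is needed only because the "remote" is a local path.
git -c protocol.file.allow=always submodule add "$work/common" common

# 3. Commit the staged .gitmodules file and the pinned submodule commit.
git commit -q -m "Add common code as a submodule"

# Show what was recorded.
cat .gitmodules
git submodule status
```

The server repository would get the same submodule in the same way, so both sides point at one source of truth for the shared files.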

Populating the submodule

  1. Execute git submodule init
  2. Execute git submodule update

Each configured submodule is then checked out at the commit recorded for it in the parent repository.
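A rough end-to-end sketch of those two steps on a fresh clone follows. As before, it builds throwaway local repositories (common, server, server-clone are hypothetical names), and the protocol.file.allow override is an assumption needed only because local paths are used in place of real remotes.

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Setup: a shared repository and a server repository that references it.
git init -q common
echo "shared helper" > common/shared.txt
git -C common add shared.txt
git -C common commit -q -m "Add shared helper"

git init -q server
git -C server -c protocol.file.allow=always submodule add "$work/common" common
git -C server commit -q -m "Add common code as a submodule"

# A plain clone creates the submodule directory but leaves it empty.
git clone -q server server-clone
cd server-clone
ls -A common

# 1. Register the submodules listed in .gitmodules in the local config.
git submodule init

# 2. Clone each submodule and check out the recorded commit.
git -c protocol.file.allow=always submodule update

ls common
```

Both steps can be combined as git submodule update --init, and git clone --recurse-submodules populates the submodules at clone time.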

Updating a submodule to a later commit

  1. Change into the directory of the submodule
  2. Perform a git pull / git fetch + git checkout to update the submodule to the desired commit / tag
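  3. Back in the parent repository, stage the submodule path and create a new commit. This records the new submodule commit in the parent repository's tree (the .gitmodules file only stores the submodule's path and URL)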
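The update steps could look roughly like this. It is again a sandbox sketch with hypothetical names (common, client, shared.txt) and a local path instead of a real remote, which is the only reason for the protocol.file.allow override. The final git ls-tree call prints the submodule commit that the parent repository now pins.

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Setup: a shared repository and a client that pins its first commit.
git init -q common
echo "shared helper" > common/shared.txt
git -C common add shared.txt
git -C common commit -q -m "Add shared helper"

git init -q client
git -C client -c protocol.file.allow=always submodule add "$work/common" common
git -C client commit -q -m "Add common code as a submodule"

# Someone publishes a newer commit in the shared repository.
echo "second helper" >> common/shared.txt
git -C common commit -q -am "Extend shared helper"

# 1. Change into the submodule directory.
cd client/common

# 2. Move the submodule to the desired commit (here: the latest upstream one).
git pull -q --ff-only

# 3. Back in the parent repository, stage and commit the new pinned commit.
cd ..
git add common
git commit -q -m "Bump common to the latest commit"

# The entry has mode 160000 and type "commit": that is the pinned revision.
git ls-tree HEAD common
```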

Take a look at the official manual for further information on submodules.

Solution 2:

If they share common code, I see two sensible options. Separate the common code into its own project, or merge the server and client repositories to make working on them together easier.

Whether separating the common code is worth the extra effort is up to you. Does the common code make sense on its own, or is it just a bunch of functions specific to this product? For example, if the common code were SSL handling or date parsing, that would make a good spin-off project. Or maybe you've written special code for config file parsing that can be worked on stand-alone, even if nobody but your project will use it. If you're spinning off common code just because the two projects share it, don't bother: it will have no direction of its own, and spinning it off will just be a barrier to development for both the server and client teams.

Whether you should merge the client and server is another consideration. It also comes down to whether it makes sense to consider them as separate products. Are they useful as separate products? Can different versions of the client and server work together, or must they be the same version? Do different people work on the client versus the server? The fact that you want to keep everything together in one super-repository says no.

If you do separate into multiple repositories (client, server, related projects) follow TimWolla's answer.

If you're not sure, merge them all into one repository with server/, client/ and common/ top level directories. If their concerns are entangled, put them together. This will also make it easier to spot and migrate duplicated code. You may work on disentangling them and creating concrete "common" projects, and at that time they should be spun off into their own repositories.

Solution 3:

TL;DR

Just use Git submodules to hold your common code. That's the use case they're intended for. Other options exist, but mostly for repositories on the same filesystem and with little advantage when cloned over a network.

Common Code vs. Identical Files

The proper way to deal with common code is usually through submodules or subtree merging. However, if you have file assets that are (and will remain) identical, then you can leverage symlinks on filesystems that support them. There are at least three downsides to this approach:

  1. Only one repository would have the "real" file. The other repository would only contain a symlink to a file that may or may not exist on a different filesystem.
  2. Windows systems handle symlinks differently from Linux/Unix (creating them usually needs extra privileges or configuration), so interoperability may be an issue.
  3. Unless you're dealing with really large binary blobs, the space savings of shared links is probably minimal, and not worth the effort.
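The first downside is easy to see for yourself. In this small sketch (the directory and file names are invented, and it assumes a filesystem with symlink support), the client repository ends up storing only the link target, not the file contents:

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# The "real" file lives next to the client repository, not inside it.
mkdir server
echo "shared contents" > server/shared.txt

git init -q client
cd client
ln -s ../server/shared.txt shared.txt
git add shared.txt

# Mode 120000 marks the index entry as a symlink.
git ls-files -s shared.txt

# The stored blob is just the target path, not the file's contents.
git cat-file -p :shared.txt
```

Anyone cloning only the client repository gets a dangling link unless the target also exists at the same relative path on their machine.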

You can also look into the use of alternates to share objects between repositories on the same filesystem. The manual says (emphasis mine):

You could be using the objects/info/alternates or $GIT_ALTERNATE_OBJECT_DIRECTORIES mechanisms to borrow objects from other object stores. A repository with this kind of incomplete object store is not suitable to be published for use with dumb transports but otherwise is OK as long as objects/info/alternates points at the object stores it borrows from.

This sort of advanced usage is usually used to speed up forking on systems like Atlassian Stash rather than for sharing specific blobs, but the tools are there if you want to juggle chainsaws.
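If you want to see what alternates look like in practice, one way to set them up is git clone --reference, which writes the objects/info/alternates file for you. The sketch below uses throwaway local repositories with invented names (origin-repo, first, second); treat it as an illustration of the mechanism rather than a recommended workflow.

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A repository to clone from.
git init -q origin-repo
echo "data" > origin-repo/file.txt
git -C origin-repo add file.txt
git -C origin-repo commit -q -m "Initial commit"

# An existing local clone that already holds the objects.
git clone -q origin-repo first

# --reference makes the new clone borrow objects from "first".
git clone -q --reference "$work/first" origin-repo second

# The borrowed object store is listed here.
cat second/.git/objects/info/alternates
```

Note the caveat from the quoted manual text: "second" now depends on "first", so deleting or pruning "first" can leave "second" with missing objects.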