Mercurial Remove History
Is there a way in mercurial to remove old changesets from a database? I have a repository that is 60GB and that makes it pretty painful to do a clone. I would like to trim off everything before a certain date and put the huge database away to collect dust.
Solution 1:
There is no simple / recommended way of doing this directly to an existing repository.
You can however "convert" your mercurial repo to a new mercurial repo and choose a revision from where to include the history onwards via the convert.hg.startrev option
hg convert --config convert.hg.startrev=1234 <source-repository> <new-repository-name>
The new repo will contain everything from the original repo minus the history previous to the starting revision.
Caveat: The new repo will have completely new changeset IDs, i.e. it is in no way related to the original repo. After creating the new repo every developer has to clone the new repo and delete their clones from the original repo.
I used this to cleanup old repos used internally within our company - combined with the --filemap option to remove unwanted files too.
Solution 2:
You can do it, but in doing so you invalidate all the clones out there, so it's generally not wise to do unless you're working entirely alone.
Every changeset in mercurial is uniquely identified by a hashcode, which is a combination of (among other things) the source code changes, metadata, and the hashes of its one or two parents. Those parents need to exist in the repo all the way back to the start of the project. (Not having that restriction would be having shallow-clones, which aren't available (yet)).
If you're okay with changing the hashes of the newer changesets (which again breaks all the clones out there in the wild) you can do so with the commands;
hg export -o 'changeset-%r.patch' 400:tip # changesets 400 through the end for example
cd /elsewhere
hg init newrepo
cd newrepo
hg import /path/to/the/patches/*.patch
You'll probably have to do a little work to handle merge changesets, but that's the general idea.
One could also do it using hg convert
with type hg
as both the source and the destination types, and using a splicemap
, but that's probably more involved yet.
The larger question is, how do you type up 60GB of source code, or were you adding generated files against all advice. :)