Subversion Obliterate feature

I was thinking of writing a shell script that implements the obliterate functionality externally: it would use the suggested workaround, but automated.

Here's what I had in mind:

On the client

  1. Run svn list -R > file-list.
  2. Filter file-list (with grep, for example) to build a file "files-to-delete", using a series of commands like grep XXX file-list >> files-to-delete.
  3. Transfer files-to-delete to the server using scp.

On the server

  1. Dump the repository: svnadmin dump /path/to/repos > repos-dumpfile. This can be kept as a backup too.
  2. Filter the dump file. For each path in "files-to-delete", do: cat repos-dumpfile | svndumpfilter exclude $file > new-dumpfile
  3. Create a new repository and load the new file into it: svnadmin create new-name; svnadmin load new-name < new-dumpfile

Would this work? How can it fail? Any other ideas?


Yes, that script would work. But usually you don't obliterate that many files. Usually obliterate is only needed if you commit confidential information accidentally.

Are you sure you want to use obliterate for so many files?


I think cat new-dumpfile | svndumpfilter exclude $file > new-dumpfile is a dangerous example. new-dumpfile will not be completely processed, and its contents will probably be lost, no?

From the comments below: the new-dumpfile will surely be lost, because the shell will clobber (truncate to zero length) it even before starting up the command.
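
To see the clobbering for yourself, here is a minimal sketch. It assumes sed as a stand-in for svndumpfilter so that it runs without a repository, and the file names are made up for the demonstration. The truncation is done by the shell, so the filter command itself makes no difference.

```shell
#!/bin/sh
# sed stands in for svndumpfilter (an assumption for this demo only).
demo_dir=$(mktemp -d)
printf 'keep-1\nsecret\nkeep-2\n' > "$demo_dir/unsafe"
cp "$demo_dir/unsafe" "$demo_dir/safe"

# Unsafe: "> unsafe" truncates the file before sed reads a single byte.
sed '/secret/d' < "$demo_dir/unsafe" > "$demo_dir/unsafe"
unsafe_bytes=$(wc -c < "$demo_dir/unsafe" | tr -d ' ')

# Safe: write to a different file, then replace the original on success.
sed '/secret/d' < "$demo_dir/safe" > "$demo_dir/safe.tmp" &&
    mv "$demo_dir/safe.tmp" "$demo_dir/safe"
safe_lines=$(wc -l < "$demo_dir/safe" | tr -d ' ')

echo "unsafe: $unsafe_bytes bytes left, safe: $safe_lines lines left"
```

In the piped form (cat file | filter > file) the timing between cat and the truncation can vary, but the result is unreliable either way, so always write to a separate output file.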


I had a similar but slightly more complex requirement. Several hundred revisions in the past, some very large (>1GB) sample data files were committed to the repository. They were then moved around and eventually deleted from HEAD. However they were still in revision history, making the repository cumbersomely large. I could not use svn list -R, since the files no longer appeared in the working copy.

However, svn list can be given a revision argument. I wasn't sure exactly when the big files had been checked in, but I knew it was sometime after revision 2000. I also had a list of file names. So I used a simple loop and uniq to generate my files-to-delete:

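cd "$working_copy"
for rev in {2000..2437}; do
    # List every path that existed in this revision and keep the big files
    svn ls -R -r "$rev" | grep -f ~/tmp/big-file-names >> ~/tmp/file-paths
done
# Remove duplicates: the same path shows up in many revisions
sort -u ~/tmp/file-paths > ~/tmp/files-to-delete
cd ~/tmp
# You should inspect "files-to-delete" to see if it looks reasonable!
# Note: the unquoted $(...) splits on whitespace, so paths containing spaces will break.
# svndumpfilter matches paths from the repository root, so the listed paths must be relative to it.
svndumpfilter exclude $(cat files-to-delete) < dumpfile > dumpfile.new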
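
The final command expands the whole path list onto the command line, which splits paths on whitespace and can hit argument-length limits with long lists. As far as I know, svndumpfilter in Subversion 1.7 and later accepts a --targets option that reads one path prefix per line from a file. Here is a hedged sketch of a wrapper built on that; filter_dump is a hypothetical helper name, not part of Subversion.

```shell
#!/bin/sh
# filter_dump DUMPFILE TARGETS OUTPUT
# Hypothetical helper: excludes every path listed in TARGETS (one per line)
# from DUMPFILE and writes the filtered dump to OUTPUT.
# Assumes svndumpfilter supports --targets (Subversion 1.7 or later).
filter_dump() {
    dump=$1
    targets=$2
    out=$3

    # Redirecting onto the input would truncate it before it is read.
    if [ "$dump" = "$out" ]; then
        echo "refusing to overwrite input dump: $dump" >&2
        return 1
    fi

    # Write to a temporary file and move it into place only on success.
    if svndumpfilter exclude --targets "$targets" < "$dump" > "$out.tmp"; then
        mv "$out.tmp" "$out"
    else
        rm -f "$out.tmp"
        return 1
    fi
}

# Usage (not executed here):
#   filter_dump dumpfile files-to-delete dumpfile.new
```

The string comparison only catches the obvious case of identical names; it does not detect two different paths that point to the same file.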