Coding Katas for practicing the refactoring of legacy code

I don't know of a site that catalogs them directly, but one strategy that I've used on occasion is this:

  1. Find an old, small, unmaintained open source project on sourceforge
  2. Download it, get it to compile/build/run
  3. Read the documentation, get a feel for the code
  4. Use the techniques in Working Effectively with Legacy Code to get a piece of it under test
  5. Refactor that piece, perhaps fixing bugs and adding features along the way
  6. Repeat steps 4 through 6

When you find a part that was especially challenging, throw away your work and repeat it a couple times to reinforce your skills.

This doesn't just practice refactoring, but other skills like code reading, testing, and dealing with build processes.

The hardest problem is finding a project that you're interested enough in to keep working in. The last one I worked on was a python library for genetic programming, and the current one I'm working on is a IRC library for Java.


I feel like necromancer replying to such an old thread, but there is one thing that would make for a worthy addition - Legacy Code Retreat.

Idea is to have a Code Retreat with legacy code and try to practice the very techniques for dealing with such, but I can't see anything that would ban you from simply using the code prepared and practicing with it by yourself. Just using it for creating a Golden Master makes for an hour of work, and there's a lot more you can do. If your kata usually last around 2 hours, I'd say just by splitting what usually happens on LCR into kata gives you four different things to work on.

There's a GitHub repository by idea's author, J.B. Rainsberger, that contains a simple legacy system that you are to work with, Trivia Game.

From my experience as organizer/participant, folks really liked this and it was illuminating to see what can be a problem in a legacy code and where your refactoring can lead you astray (and how!). Here's yet another account of how it looks like, by Andreas Leidig.


Emily Bache has a github repository with some refactoring katas: Emily Bache's Refactoring Kata Repo. There are variants of KataYahtzee and KataTennis to refactor. Also, she has a variant of the Gilded Rose Kata, which was designed as a refactoring kata.

Also, she has the Racing Car Katas in her repo: Racing Car Kata. The Race Car Katas also include good exercises for refactoring.

Those kata have the code in multiple langauages:

  • C++
  • C#
  • Java
  • Javascript
  • Python
  • Ruby