How do I create a readable diff of two spreadsheets using git diff?

We have a lot of spreadsheets (xls) in our source code repository. These are usually edited with Gnumeric or OpenOffice.org, and are mostly used to populate databases for unit testing with dbUnit. There are no easy ways of doing diffs on xls files that I know of, and this makes merging extremely tedious and error-prone.

I've tried converting the spreadsheets to XML and doing a regular diff, but it really feels like it should be a last resort.

I'd like to perform the diffing (and merging) with git as I do with text files. How would I do this, e.g. when issuing git diff?


Solution 1:

We faced the exact same issue at our company. Our tests output Excel workbooks, and a binary diff was not an option, so we wrote our own simple command-line tool. Check out the ExcelCompare project. In fact, it lets us automate our tests quite nicely. Patches and feature requests are welcome!

Solution 2:

A quick and easy approach with no external tools. It works well as long as the two sheets you are comparing are similar:

  • Create a third spreadsheet
  • Type =if(Sheet1!A1 <> Sheet2!A1, "X", "") in the top left cell (or equivalent: click on the actual cells to automatically have the references inserted into the formula)
  • Ctrl+C (copy), Ctrl+A (select all), Ctrl+V (paste) to fill the sheet.

If the sheets are similar, this spreadsheet will be empty except for a few cells with X in them, highlighting the differences. Zoom out to 40% to quickly see what is different.
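
The same cell-by-cell comparison can be sketched in Python if you would rather script it. This is only a rough illustration, not part of the original answer: it assumes each sheet has already been loaded into a list of rows (how you read the xls file is up to you), and the names column_letters, diff_cells, old and new are made up for the example.

```python
from itertools import zip_longest


def column_letters(index):
    """Convert a 0-based column index to spreadsheet letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def diff_cells(sheet1, sheet2):
    """Return (cell reference, value in sheet1, value in sheet2) for each differing cell.

    Missing rows and missing cells are treated as empty strings, which mirrors
    how the formula approach sees blank cells.
    """
    differences = []
    for r, (row1, row2) in enumerate(zip_longest(sheet1, sheet2, fillvalue=[])):
        for c, (a, b) in enumerate(zip_longest(row1, row2, fillvalue="")):
            if a != b:
                differences.append((f"{column_letters(c)}{r + 1}", a, b))
    return differences


# Hypothetical sample data standing in for two loaded sheets.
old = [["id", "name"], [1, "alice"], [2, "bob"]]
new = [["id", "name"], [1, "alice"], [2, "bobby"], [3, "carol"]]

for ref, a, b in diff_cells(old, new):
    print(f"{ref}: {a!r} -> {b!r}")
```

With the sample data this reports B3 (changed value) plus A4 and B4 (the added row). Like the formula version, it compares by position only, so an inserted row in the middle will show every following row as different.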