How to version control a record in a database [closed]
Let's say you have a FOO table that admins and users can update. Most of the time you can write queries against the FOO table. Happy days.
Then, I would create a FOO_HISTORY table. This has all the columns of the FOO table. The primary key is the same as FOO plus a RevisionNumber column. There is a foreign key from FOO_HISTORY to FOO. You might also add columns related to the revision, such as the UserId and RevisionDate. Populate the RevisionNumbers in an ever-increasing fashion across all the *_HISTORY tables (e.g. from an Oracle sequence or equivalent). Do not rely on there only being one change in a second (i.e. do not put RevisionDate into the primary key).
Now, every time you update FOO, just before you do the update you insert the old values into FOO_HISTORY. You do this at some fundamental level in your design (for example, in a database trigger or a single shared data-access layer) so that programmers can't accidentally miss this step.
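As a rough sketch of the FOO / FOO_HISTORY idea, here it is with Python's built-in sqlite3 module. The column names (`name`, `price`), the `revision_seq` table standing in for an Oracle sequence, and the trigger name are all my own assumptions for illustration; a real schema and database would differ. The trigger copies the old row into the history table before every update, so application code cannot skip the step.

```python
import sqlite3

# Hypothetical schema: column names and the revision_seq table are
# illustrative assumptions, not part of the original description.
SCHEMA = """
CREATE TABLE foo (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    price REAL NOT NULL
);

-- Stand-in for a sequence: one shared counter for all *_history tables.
CREATE TABLE revision_seq (
    revision_number INTEGER PRIMARY KEY AUTOINCREMENT
);

CREATE TABLE foo_history (
    id              INTEGER NOT NULL REFERENCES foo(id),
    revision_number INTEGER NOT NULL,
    name            TEXT NOT NULL,
    price           REAL NOT NULL,
    revision_date   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id, revision_number)  -- note: no date in the key
);

-- Copy the OLD values into the history table just before each update.
CREATE TRIGGER foo_before_update BEFORE UPDATE ON foo
BEGIN
    INSERT INTO revision_seq (revision_number) VALUES (NULL);
    INSERT INTO foo_history (id, revision_number, name, price)
    VALUES (OLD.id,
            (SELECT MAX(revision_number) FROM revision_seq),
            OLD.name,
            OLD.price);
END;
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = open_db()
conn.execute("INSERT INTO foo (id, name, price) VALUES (1, 'widget', 9.99)")
conn.execute("UPDATE foo SET price = 12.50 WHERE id = 1")
conn.execute("UPDATE foo SET name = 'gadget' WHERE id = 1")

# The current row lives in foo; earlier values live in foo_history.
current = conn.execute("SELECT name, price FROM foo WHERE id = 1").fetchone()
history = conn.execute(
    "SELECT revision_number, name, price FROM foo_history "
    "WHERE id = 1 ORDER BY revision_number"
).fetchall()
```

Normal queries keep reading from `foo` only; `foo_history` is touched just when someone asks for an old revision. A UserId column is left out here because SQLite triggers have no notion of the application user; you would have to supply it from the application.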
If you want to delete a row from FOO you have some choices. Either cascade and delete all the history, or perform a logical delete by flagging the FOO row as deleted.
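A minimal sketch of the logical-delete option, again with sqlite3. The `is_deleted` column and the `foo_active` view are names I made up for the example; the idea is only that the row stays in place (so its history stays valid) and everyday queries go through a view that hides flagged rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foo (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    is_deleted INTEGER NOT NULL DEFAULT 0  -- hypothetical flag column
);

-- Everyday queries read from this view instead of the base table.
CREATE VIEW foo_active AS
    SELECT id, name FROM foo WHERE is_deleted = 0;
""")
conn.executemany(
    "INSERT INTO foo (id, name) VALUES (?, ?)",
    [(1, "widget"), (2, "gadget")],
)


def logical_delete(conn, foo_id):
    """Flag the row as deleted instead of removing it."""
    conn.execute("UPDATE foo SET is_deleted = 1 WHERE id = ?", (foo_id,))


logical_delete(conn, 1)

active = conn.execute("SELECT id, name FROM foo_active ORDER BY id").fetchall()
total_rows = conn.execute("SELECT COUNT(*) FROM foo").fetchone()[0]
```

For the other option, declaring the history table's foreign key with `ON DELETE CASCADE` lets the database remove the history rows together with the parent row.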
This solution is good when you are largely interested in the current values and only occasionally in the history. If you always need the history then you can add effective start and end dates and keep all the records in FOO itself. Every query then needs to check those dates.
I think you are looking for versioning the content of database records (as StackOverflow does when someone edits a question/answer). A good starting point might be looking at some database model that uses revision tracking.
The best example that comes to mind is MediaWiki, the Wikipedia engine. Compare the database diagram here, particularly the revision table.
Depending on what technologies you're using, you'll have to find some good diff/merge algorithms.
Check this question if it's for .NET.
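To give an idea of what the diff step looks like once two revisions of a text are stored, here is a small sketch using Python's standard difflib module (used purely as an illustration; for .NET you would pick an equivalent library). The function name and the sample revision texts are made up for the example.

```python
import difflib


def revision_diff(old_text, new_text, old_label="rev 1", new_label="rev 2"):
    """Return a unified diff between two stored revisions as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Hypothetical contents of two revisions of the same post.
old = "How do I version a record?\nI use MySQL."
new = "How do I version a record?\nI use PostgreSQL."

diff = revision_diff(old, new)
for line in diff:
    print(line)
```

Storing full text per revision and diffing on demand, as here, is the simpler design; storing only deltas saves space but makes reading an old revision more work.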
In the BI world, you could accomplish this by adding a startDate and endDate to the table you want to version. When you insert the first record into the table, the startDate is populated, but the endDate is null. When you insert the second record, you also update the endDate of the first record with the startDate of the second record.
When you want to view the current record, you select the one where endDate is null.
This is sometimes called a type 2 Slowly Changing Dimension. See also TupleVersioning.
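The startDate/endDate bookkeeping described above could be sketched like this with sqlite3. The table layout, the surrogate `version_id` key, the function names, and the sample timestamps are assumptions for illustration only. Each save closes the open row and inserts a new one in the same transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE foo (
    version_id INTEGER PRIMARY KEY,  -- hypothetical surrogate key
    foo_id     INTEGER NOT NULL,     -- business key shared by all versions
    name       TEXT NOT NULL,
    startDate  TEXT NOT NULL,        -- ISO-8601 text, so it sorts correctly
    endDate    TEXT                  -- NULL marks the current version
)
""")


def save_version(conn, foo_id, name, now):
    """Close the current version (if any) and insert the new one."""
    with conn:  # both statements commit or roll back together
        conn.execute(
            "UPDATE foo SET endDate = ? WHERE foo_id = ? AND endDate IS NULL",
            (now, foo_id),
        )
        conn.execute(
            "INSERT INTO foo (foo_id, name, startDate) VALUES (?, ?, ?)",
            (foo_id, name, now),
        )


def current_version(conn, foo_id):
    """The current record is the one whose endDate is NULL."""
    return conn.execute(
        "SELECT name, startDate FROM foo WHERE foo_id = ? AND endDate IS NULL",
        (foo_id,),
    ).fetchone()


def version_as_of(conn, foo_id, when):
    """The record that was in effect at the given point in time."""
    return conn.execute(
        "SELECT name FROM foo WHERE foo_id = ? AND startDate <= ? "
        "AND (endDate IS NULL OR endDate > ?)",
        (foo_id, when, when),
    ).fetchone()


# Sample timestamps are made up for the example.
save_version(conn, 1, "widget", "2024-01-01T00:00:00")
save_version(conn, 1, "gadget", "2024-06-01T00:00:00")

latest = current_version(conn, 1)
earlier = version_as_of(conn, 1, "2024-03-15T00:00:00")
```

The half-open interval (startDate inclusive, endDate exclusive) means a timestamp that falls exactly on a change belongs to the newer version, so no instant matches two rows.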
Upgrade to SQL Server 2008.
Try using Change Tracking in SQL Server 2008. Instead of timestamp and tombstone column hacks, you can use this feature to track changes to the data in your database.
MSDN SQL 2008 Change Tracking