Dirty fields in Django

Solution 1:

I've found Armin's idea very useful. Here is my variation:

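class DirtyFieldsMixin(object):
    def __init__(self, *args, **kwargs):
        super(DirtyFieldsMixin, self).__init__(*args, **kwargs)
        # Snapshot the field values as they were when the instance was created or loaded.
        self._original_state = self._as_dict()

    def _as_dict(self):
        # Local, non-relation fields only. On Django older than 1.9, test f.rel instead of f.remote_field.
        return {f.name: getattr(self, f.name) for f in self._meta.local_fields if not f.remote_field}

    def get_dirty_fields(self):
        # Map each changed field name to its original value.
        new_state = self._as_dict()
        return {key: value for key, value in self._original_state.items() if value != new_state[key]}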

Edit: I've tested this BTW.

Sorry about the long lines. Aside from the names, the difference is that it caches only local non-relation fields. In other words, it doesn't cache a parent model's fields, if there is a parent model.

One more thing: you need to reset the _original_state dict after saving. I didn't want to override save() in the mixin, though, since most of the time we discard model instances after saving. If you do need it, override save() in your model (Klass here):

def save(self, *args, **kwargs):
    super(Klass, self).save(*args, **kwargs)
    self._original_state = self._as_dict()
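
To see the snapshot, diff, and reset cycle without a database, here is a minimal sketch that uses a plain Python stand-in instead of a real model. `SnapshotMixin`, `FakeModel`, the `FIELDS` tuple, and the no-op `save()` are all hypothetical names for illustration; a real Django model would use the mixin above, which reads the field list from `self._meta.local_fields`.

```python
class SnapshotMixin(object):
    """Plain-Python stand-in for the mixin above (no Django required)."""

    FIELDS = ()  # hypothetical: names of the attributes to track

    def _as_dict(self):
        return {name: getattr(self, name) for name in self.FIELDS}

    def _reset_state(self):
        self._original_state = self._as_dict()

    def get_dirty_fields(self):
        new_state = self._as_dict()
        # Map each changed field to its ORIGINAL value, as in the mixin above.
        return {key: value for key, value in self._original_state.items()
                if value != new_state[key]}


class FakeModel(SnapshotMixin):
    FIELDS = ("title", "views")

    def __init__(self, title, views):
        self.title = title
        self.views = views
        self._reset_state()

    def save(self):
        # A real model would write to the database here.
        self._reset_state()


post = FakeModel(title="Draft", views=0)
post.title = "Final"
dirty_before_save = post.get_dirty_fields()  # 'title' maps to its old value
post.save()
dirty_after_save = post.get_dirty_fields()   # empty once the state is reset
```

Without the reset in `save()`, the second call would still report `title` as dirty, which is the situation the override above avoids.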

Solution 2:

You haven't said very much about your specific use case or needs. In particular, it would be helpful to know what you need to do with the change information (how long do you need to store it?). If you only need to store it for transient purposes, @S.Lott's session solution may be best. If you want a full audit trail of all changes to your objects stored in the DB, try this AuditTrail solution.

UPDATE: The AuditTrail code I linked to above is the closest I've seen to a full solution that would work for your case, though it has some limitations (doesn't work at all for ManyToMany fields). It will store all previous versions of your objects in the DB, so the admin could roll back to any previous version. You'd have to work with it a bit if you want the change to not take effect until approved.

You could also build a custom solution based on something like @Armin Ronacher's DiffingMixin. You'd store the diff dictionary (maybe pickled?) in a table for the admin to review later and apply if desired (you'd need to write the code to take the diff dictionary and apply it to an instance).
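
As a rough sketch of the "store the diff, apply it later" idea, the snippet below pickles a dictionary of proposed changes and applies it to an object once it is approved. `apply_diff`, `Article`, and the `pending` dict are hypothetical; the dict here holds the new values to apply, and where the serialized bytes are stored (for example a binary column in a review table) is left open.

```python
import pickle


def apply_diff(instance, diff):
    """Set every attribute named in `diff` on `instance` and return it."""
    for field_name, value in diff.items():
        setattr(instance, field_name, value)
    return instance


class Article(object):  # hypothetical stand-in for a model instance
    def __init__(self, title, body):
        self.title = title
        self.body = body


# Proposed changes, keyed by field name (new values, not original ones).
pending = {"title": "Reviewed title"}

# Serialize for storage until the admin reviews the change.
stored = pickle.dumps(pending)

# Later, when the change is approved:
article = Article(title="Old title", body="Text")
apply_diff(article, pickle.loads(stored))
```

Unpickling data from an untrusted source is unsafe, so if the diff only contains simple values, `json.dumps` and `json.loads` would be a safer choice for the same pattern.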

Solution 3:

Django is currently sending all columns to the database, even if you just changed one. To change this, some changes in the database system would be necessary. This could be easily implemented on the existing code by adding a set of dirty fields to the model and adding column names to it, each time you __set__ a column value.

If you need that feature, I would suggest you look at the Django ORM, implement it and put a patch into the Django trac. It should be very easy to add that and it would help other users too. When you do that, add a hook that is called each time a column is set.
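
The idea of a dirty set filled by a hook on every assignment can be sketched in plain Python. This is not Django's ORM code; `DirtyTrackingBase`, `on_field_set`, `dirty_columns`, and `mark_clean` are hypothetical names, and attributes starting with an underscore are assumed to be internal and are not tracked.

```python
class DirtyTrackingBase(object):
    """Records the name of every public attribute assigned after loading."""

    def __init__(self, **values):
        # Bypass the hook while loading the initial values.
        object.__setattr__(self, "_dirty", set())
        for name, value in values.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        if not name.startswith("_"):
            self._dirty.add(name)
            self.on_field_set(name, value)
        object.__setattr__(self, name, value)

    def on_field_set(self, name, value):
        """Hook called each time a field is set; override as needed."""

    def dirty_columns(self):
        return sorted(self._dirty)

    def mark_clean(self):
        # Call this after a successful save.
        self._dirty.clear()


row = DirtyTrackingBase(id=1, name="old", email="a@example.com")
row.name = "new"
changed = row.dirty_columns()  # only the assigned column is listed
```

Note that this variant marks a column dirty on every assignment, even when the new value equals the old one; comparing values first would avoid that at the cost of an extra lookup.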

If you don't want to hack on Django itself, you could copy the dict on object creation and diff it.

Maybe with a mixin like this:

class DiffingMixin(object):

    def __init__(self, *args, **kwargs):
        super(DiffingMixin, self).__init__(*args, **kwargs)
        self._original_state = dict(self.__dict__)

    def get_changed_columns(self):
        missing = object()
        result = {}
        for key, value in self._original_state.items():
            # Compare the original value (not the key) with the current one.
            if value != self.__dict__.get(key, missing):
                result[key] = value
        return result

class MyModel(DiffingMixin, models.Model):
    pass

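This code is untested but should work. When you call model.get_changed_columns() you get a dict mapping each changed attribute to its original value. This of course won't work for mutable objects in columns, because the original state is only a shallow copy of the dict: a value mutated in place is shared between the copy and the instance, so it never compares as changed.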
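
One possible way around the mutable-value limitation is to deep-copy the values when taking the snapshot. The sketch below is a variation on the mixin, not part of the original answer: `DeepDiffingMixin`, `Record`, and `TrackedRecord` are hypothetical, `Record` stands in for `models.Model`, and skipping underscore-prefixed keys is an assumption made so that internal entries in `__dict__` (a real Django instance keeps a `_state` object there) are not copied and compared.

```python
import copy


class DeepDiffingMixin(object):
    def __init__(self, *args, **kwargs):
        super(DeepDiffingMixin, self).__init__(*args, **kwargs)
        # Deep-copy public attributes so in-place mutations are detectable.
        self._original_state = {
            key: copy.deepcopy(value)
            for key, value in self.__dict__.items()
            if not key.startswith("_")
        }

    def get_changed_columns(self):
        missing = object()
        return {key: value for key, value in self._original_state.items()
                if value != self.__dict__.get(key, missing)}


class Record(object):  # hypothetical stand-in for models.Model
    def __init__(self, tags):
        self.tags = tags


class TrackedRecord(DeepDiffingMixin, Record):
    pass


record = TrackedRecord(tags=["a"])
record.tags.append("b")  # in-place mutation of a mutable value
changed = record.get_changed_columns()
```

The trade-off is cost: deep-copying every value on instantiation can be expensive for large objects, so this only makes sense when mutable column values actually occur.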

Solution 4:

Adding a second answer because a lot has changed since this question was originally posted.

There are a number of apps in the Django world that solve this problem now. You can find a full list of model auditing and history apps on the Django Packages site.

I wrote a blog post comparing a few of these apps. This post is now 4 years old and it's a little dated. The different approaches for solving this problem seem to be the same though.

The approaches:

  1. Store all historical changes in a serialized format (JSON?) in a single table
  2. Store all historical changes in a table mirroring the original for each model
  3. Store all historical changes in the same table as the original model (I don't recommend this)
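
To make the first approach concrete, here is a minimal sketch that keeps JSON snapshots of any model in a single history table, using only the standard library's `sqlite3` and `json` modules. It is an illustration of the idea, not how any particular app implements it; the `history` table layout and the `record_version` and `versions` helpers are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " model TEXT NOT NULL,"
    " object_id INTEGER NOT NULL,"
    " data TEXT NOT NULL)"
)


def record_version(model, object_id, fields):
    """Append one serialized snapshot of an object to the history table."""
    conn.execute(
        "INSERT INTO history (model, object_id, data) VALUES (?, ?, ?)",
        (model, object_id, json.dumps(fields, sort_keys=True)),
    )


def versions(model, object_id):
    """Return all snapshots of one object, oldest first."""
    rows = conn.execute(
        "SELECT data FROM history WHERE model = ? AND object_id = ? ORDER BY id",
        (model, object_id),
    )
    return [json.loads(data) for (data,) in rows]


record_version("article", 1, {"title": "Draft"})
record_version("article", 1, {"title": "Final"})
record_version("author", 1, {"name": "Ann"})
article_history = versions("article", 1)
```

The second approach would instead create one history table per model, with the same columns as the original, which makes history queryable by column but requires a schema change for every tracked model.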

The django-reversion package still seems to be the most popular solution to this problem. It takes the first approach: serialize changes instead of mirroring tables.

I revived django-simple-history a few years back. It takes the second approach: mirror each table.

So I would recommend using an app to solve this problem. There are a couple of popular ones that work pretty well at this point.

Oh, and if you're just looking for dirty field checking rather than storing all historical changes, check out FieldTracker from django-model-utils.