Is it possible to use a natural key for a GenericForeignKey in Django?
TL;DR - Currently there is no sane way of doing so, short of creating a custom Serializer
/ Deserializer
pair.
The problem with models that have generic relations is that Django doesn't see target
as a field at all, only target_content_type
and target_object_id
, and it tries to serialize and deserialize them individually.
The classes responsible for serializing and deserializing Django models are in the modules django.core.serializers.base
and django.core.serializers.python
. All the others (xml
, json
and yaml
) extend either of them (and python
extends base
). The field serialization is done like this (irrelevant lines ommited):
for obj in queryset:
for field in concrete_model._meta.local_fields:
if field.rel is None:
self.handle_field(obj, field)
else:
self.handle_fk_field(obj, field)
Here's the first complication: the foreign key to ContentType
is handled ok, with natural keys as we expected. But the PositiveIntegerField
is handled by handle_field
, that is implemented like this:
def handle_field(self, obj, field):
value = field._get_val_from_obj(obj)
# Protected types (i.e., primitives like None, numbers, dates,
# and Decimals) are passed through as is. All other values are
# converted to string first.
if is_protected_type(value):
self._current[field.name] = value
else:
self._current[field.name] = field.value_to_string(obj)
i.e. the only possibility for customization here (subclassing PositiveIntegerField
and defining a custom value_to_string
) will have no effect, since the serializer won't call it. Changing the data type of target_object_id
to something else than a integer will probably break many other stuff, so it's not an option.
We could define our custom handle_field
to emit natural keys in this case, but then comes the second complication: the deserialization is done like this:
for (field_name, field_value) in six.iteritems(d["fields"]):
field = Model._meta.get_field(field_name)
...
data[field.name] = field.to_python(field_value)
Even if we customized the to_python
method, it acts on the field_value
alone, out of the context of the object. It's not a problem when using integers, since it will be interpreted as the model's primary key no matter what model it is. But to deserialize a natural key, first we need to know which model that key belongs to, and that information isn't available unless we got a reference to the object (and the target_content_type
field had already been deserialized).
As you can see, it's not an impossible task - supporting natural keys in generic relations - but to accomplish that a lot of things would need to be changed in the serialization and deserialization code. The steps necessary, then (if anyone feels up to the task) are:
- Create a custom
Field
extendingPositiveIntegerField
, with methods to encode/decode an object - calling the referenced models'natural_key
andget_by_natural_key
; - Override the serializer's
handle_field
to call the encoder if present; - Implement a custom deserializer that: 1) imposes some order in the fields, ensuring the content type is deserialized before the natural key; 2) calls the decoder, passing not only the
field_value
but also a reference to the decodedContentType
.
I've written a custom Serializer and Deserializer which supports GenericFK's. Checked it briefly and it seems to do the job.
This is what I came up with:
import json
from django.contrib.contenttypes.generic import GenericForeignKey
from django.utils import six
from django.core.serializers.json import Serializer as JSONSerializer
from django.core.serializers.python import Deserializer as \
PythonDeserializer, _get_model
from django.core.serializers.base import DeserializationError
import sys
class Serializer(JSONSerializer):
def get_dump_object(self, obj):
dumped_object = super(CustomJSONSerializer, self).get_dump_object(obj)
if self.use_natural_keys and hasattr(obj, 'natural_key'):
dumped_object['pk'] = obj.natural_key()
# Check if there are any generic fk's in this obj
# and add a natural key to it which will be deserialized by a matching Deserializer.
for virtual_field in obj._meta.virtual_fields:
if type(virtual_field) == GenericForeignKey:
content_object = getattr(obj, virtual_field.name)
dumped_object['fields'][virtual_field.name + '_natural_key'] = content_object.natural_key()
return dumped_object
def Deserializer(stream_or_string, **options):
"""
Deserialize a stream or string of JSON data.
"""
if not isinstance(stream_or_string, (bytes, six.string_types)):
stream_or_string = stream_or_string.read()
if isinstance(stream_or_string, bytes):
stream_or_string = stream_or_string.decode('utf-8')
try:
objects = json.loads(stream_or_string)
for obj in objects:
Model = _get_model(obj['model'])
if isinstance(obj['pk'], (tuple, list)):
o = Model.objects.get_by_natural_key(*obj['pk'])
obj['pk'] = o.pk
# If has generic fk's, find the generic object by natural key, and set it's
# pk according to it.
for virtual_field in Model._meta.virtual_fields:
if type(virtual_field) == GenericForeignKey:
natural_key_field_name = virtual_field.name + '_natural_key'
if natural_key_field_name in obj['fields']:
content_type = getattr(o, virtual_field.ct_field)
content_object_by_natural_key = content_type.model_class().\
objects.get_by_natural_key(obj['fields'][natural_key_field_name][0])
obj['fields'][virtual_field.fk_field] = content_object_by_natural_key.pk
for obj in PythonDeserializer(objects, **options):
yield obj
except GeneratorExit:
raise
except Exception as e:
# Map to deserializer error
six.reraise(DeserializationError, DeserializationError(e), sys.exc_info()[2])