Aggregate (and other annotated) fields in Django Rest Framework serializers
I am trying to figure out the best way to add annotated fields, such as any aggregated (calculated) fields to DRF (Model)Serializers. My use case is simply a situation where an endpoint returns fields that are NOT stored in a database but calculated from a database.
Let's look at the following example:
models.py
class IceCreamCompany(models.Model):
name = models.CharField(primary_key = True, max_length = 255)
class IceCreamTruck(models.Model):
company = models.ForeignKey('IceCreamCompany', related_name='trucks')
capacity = models.IntegerField()
serializers.py
class IceCreamCompanySerializer(serializers.ModelSerializer):
class Meta:
model = IceCreamCompany
desired JSON output:
[
{
"name": "Pete's Ice Cream",
"total_trucks": 20,
"total_capacity": 4000
},
...
]
I have a couple solutions that work, but each have some issues.
Option 1: add getters to model and use SerializerMethodFields
models.py
class IceCreamCompany(models.Model):
name = models.CharField(primary_key=True, max_length=255)
def get_total_trucks(self):
return self.trucks.count()
def get_total_capacity(self):
return self.trucks.aggregate(Sum('capacity'))['capacity__sum']
serializers.py
class IceCreamCompanySerializer(serializers.ModelSerializer):
def get_total_trucks(self, obj):
return obj.get_total_trucks
def get_total_capacity(self, obj):
return obj.get_total_capacity
total_trucks = SerializerMethodField()
total_capacity = SerializerMethodField()
class Meta:
model = IceCreamCompany
fields = ('name', 'total_trucks', 'total_capacity')
The above code can perhaps be refactored a bit, but it won't change the fact that this option will perform 2 extra SQL queries per IceCreamCompany which is not very efficient.
Option 2: annotate in ViewSet.get_queryset
models.py as originally described.
views.py
class IceCreamCompanyViewSet(viewsets.ModelViewSet):
queryset = IceCreamCompany.objects.all()
serializer_class = IceCreamCompanySerializer
def get_queryset(self):
return IceCreamCompany.objects.annotate(
total_trucks = Count('trucks'),
total_capacity = Sum('trucks__capacity')
)
This will get the aggregated fields in a single SQL query but I'm not sure how I would add them to the Serializer as DRF doesn't magically know that I've annotated these fields in the QuerySet. If I add total_trucks and total_capacity to the serializer, it will throw an error about these fields not being present on the Model.
Option 2 can be made work without a serializer by using a View but if the model contains a lot of fields, and only some are required to be in the JSON, it would be a somewhat ugly hack to build the endpoint without a serializer.
Solution 1:
Possible solution:
views.py
class IceCreamCompanyViewSet(viewsets.ModelViewSet):
queryset = IceCreamCompany.objects.all()
serializer_class = IceCreamCompanySerializer
def get_queryset(self):
return IceCreamCompany.objects.annotate(
total_trucks=Count('trucks'),
total_capacity=Sum('trucks__capacity')
)
serializers.py
class IceCreamCompanySerializer(serializers.ModelSerializer):
total_trucks = serializers.IntegerField()
total_capacity = serializers.IntegerField()
class Meta:
model = IceCreamCompany
fields = ('name', 'total_trucks', 'total_capacity')
By using Serializer fields I got a small example to work. The fields must be declared as the serializer's class attributes so DRF won't throw an error about them not existing in the IceCreamCompany model.
Solution 2:
I made a slight simplification of elnygreen's answer by annotating the queryset when I defined it. Then I don't need to override get_queryset()
.
# views.py
class IceCreamCompanyViewSet(viewsets.ModelViewSet):
queryset = IceCreamCompany.objects.annotate(
total_trucks=Count('trucks'),
total_capacity=Sum('trucks__capacity'))
serializer_class = IceCreamCompanySerializer
# serializers.py
class IceCreamCompanySerializer(serializers.ModelSerializer):
total_trucks = serializers.IntegerField()
total_capacity = serializers.IntegerField()
class Meta:
model = IceCreamCompany
fields = ('name', 'total_trucks', 'total_capacity')
As elnygreen said, the fields must be declared as the serializer's class attributes to avoid an error about them not existing in the IceCreamCompany model.
Solution 3:
You can hack the ModelSerializer constructor to modify the queryset it's passed by a view or viewset.
class IceCreamCompanySerializer(serializers.ModelSerializer):
total_trucks = serializers.IntegerField(readonly=True)
total_capacity = serializers.IntegerField(readonly=True)
class Meta:
model = IceCreamCompany
fields = ('name', 'total_trucks', 'total_capacity')
def __new__(cls, *args, **kwargs):
if args and isinstance(args[0], QuerySet):
queryset = cls._build_queryset(args[0])
args = (queryset, ) + args[1:]
return super().__new__(cls, *args, **kwargs)
@classmethod
def _build_queryset(cls, queryset):
# modify the queryset here
return queryset.annotate(
total_trucks=...,
total_capacity=...,
)
There is no significance in the name _build_queryset
(it's not overriding anything), it just allows us to keep the bloat out of the constructor.