Aggregate (and other annotated) fields in Django Rest Framework serializers

I am trying to figure out the best way to add annotated fields, such as any aggregated (calculated) fields to DRF (Model)Serializers. My use case is simply a situation where an endpoint returns fields that are NOT stored in a database but calculated from a database.

Let's look at the following example:

models.py

class IceCreamCompany(models.Model):
    name = models.CharField(primary_key = True, max_length = 255)

class IceCreamTruck(models.Model):
    company = models.ForeignKey('IceCreamCompany', related_name='trucks')
    capacity = models.IntegerField()

serializers.py

class IceCreamCompanySerializer(serializers.ModelSerializer):
    class Meta:
        model = IceCreamCompany

desired JSON output:

[

    {
        "name": "Pete's Ice Cream",
        "total_trucks": 20,
        "total_capacity": 4000
    },
    ...
]

I have a couple solutions that work, but each have some issues.

Option 1: add getters to model and use SerializerMethodFields

models.py

class IceCreamCompany(models.Model):
    name = models.CharField(primary_key=True, max_length=255)

    def get_total_trucks(self):
        return self.trucks.count()

    def get_total_capacity(self):
        return self.trucks.aggregate(Sum('capacity'))['capacity__sum']

serializers.py

class IceCreamCompanySerializer(serializers.ModelSerializer):

    def get_total_trucks(self, obj):
        return obj.get_total_trucks

    def get_total_capacity(self, obj):
        return obj.get_total_capacity

    total_trucks = SerializerMethodField()
    total_capacity = SerializerMethodField()

    class Meta:
        model = IceCreamCompany
        fields = ('name', 'total_trucks', 'total_capacity')

The above code can perhaps be refactored a bit, but it won't change the fact that this option will perform 2 extra SQL queries per IceCreamCompany which is not very efficient.

Option 2: annotate in ViewSet.get_queryset

models.py as originally described.

views.py

class IceCreamCompanyViewSet(viewsets.ModelViewSet):
    queryset = IceCreamCompany.objects.all()
    serializer_class = IceCreamCompanySerializer

    def get_queryset(self):
        return IceCreamCompany.objects.annotate(
            total_trucks = Count('trucks'),
            total_capacity = Sum('trucks__capacity')
        )

This will get the aggregated fields in a single SQL query but I'm not sure how I would add them to the Serializer as DRF doesn't magically know that I've annotated these fields in the QuerySet. If I add total_trucks and total_capacity to the serializer, it will throw an error about these fields not being present on the Model.

Option 2 can be made work without a serializer by using a View but if the model contains a lot of fields, and only some are required to be in the JSON, it would be a somewhat ugly hack to build the endpoint without a serializer.


Solution 1:

Possible solution:

views.py

class IceCreamCompanyViewSet(viewsets.ModelViewSet):
    queryset = IceCreamCompany.objects.all()
    serializer_class = IceCreamCompanySerializer

    def get_queryset(self):
        return IceCreamCompany.objects.annotate(
            total_trucks=Count('trucks'),
            total_capacity=Sum('trucks__capacity')
        )

serializers.py

class IceCreamCompanySerializer(serializers.ModelSerializer):
    total_trucks = serializers.IntegerField()
    total_capacity = serializers.IntegerField()

    class Meta:
        model = IceCreamCompany
        fields = ('name', 'total_trucks', 'total_capacity')

By using Serializer fields I got a small example to work. The fields must be declared as the serializer's class attributes so DRF won't throw an error about them not existing in the IceCreamCompany model.

Solution 2:

I made a slight simplification of elnygreen's answer by annotating the queryset when I defined it. Then I don't need to override get_queryset().

# views.py
class IceCreamCompanyViewSet(viewsets.ModelViewSet):
    queryset = IceCreamCompany.objects.annotate(
            total_trucks=Count('trucks'),
            total_capacity=Sum('trucks__capacity'))
    serializer_class = IceCreamCompanySerializer

# serializers.py
class IceCreamCompanySerializer(serializers.ModelSerializer):
    total_trucks = serializers.IntegerField()
    total_capacity = serializers.IntegerField()

    class Meta:
        model = IceCreamCompany
        fields = ('name', 'total_trucks', 'total_capacity')

As elnygreen said, the fields must be declared as the serializer's class attributes to avoid an error about them not existing in the IceCreamCompany model.

Solution 3:

You can hack the ModelSerializer constructor to modify the queryset it's passed by a view or viewset.

class IceCreamCompanySerializer(serializers.ModelSerializer):
    total_trucks = serializers.IntegerField(readonly=True)
    total_capacity = serializers.IntegerField(readonly=True)

    class Meta:
        model = IceCreamCompany
        fields = ('name', 'total_trucks', 'total_capacity')

    def __new__(cls, *args, **kwargs):
        if args and isinstance(args[0], QuerySet):
              queryset = cls._build_queryset(args[0])
              args = (queryset, ) + args[1:]
        return super().__new__(cls, *args, **kwargs)

    @classmethod
    def _build_queryset(cls, queryset):
         # modify the queryset here
         return queryset.annotate(
             total_trucks=...,
             total_capacity=...,
         )

There is no significance in the name _build_queryset (it's not overriding anything), it just allows us to keep the bloat out of the constructor.