Django Query That Get Most Recent Objects From Different Categories

Solution 1:

Starting from Django 1.11 and thanks to Subquery and OuterRef, we can finally build a latest-per-group query using the ORM.

hottest_cakes = Cake.objects.filter(
    baked_at=Subquery(
        (Cake.objects
            .filter(bakery=OuterRef('bakery'))
            .values('bakery')
            .annotate(last_bake=Max('baked_at'))
            .values('last_bake')[:1]
        )
    )
)

#BONUS, we can now use this for prefetch_related()
bakeries = Bakery.objects.all().prefetch_related(
    Prefetch('cake_set',
        queryset=hottest_cakes,
        to_attr='hottest_cakes'
    )
)

#usage
for bakery in bakeries:
    print 'Bakery %s has %s hottest_cakes' % (bakery, len(bakery.hottest_cakes))

Solution 2:

As far as I know, there is no one-step way of doing this in Django ORM, but you can split it into two queries:

from django.db.models import Max

bakeries = Bakery.objects.annotate(
    hottest_cake_baked_at=Max('cake__baked_at')
) 
hottest_cakes = Cake.objects.filter(
    baked_at__in=[b.hottest_cake_baked_at for b in bakeries]
)

If id's of cakes are progressing along with bake_at timestamps, you can simplify and disambiguate the above code (in case two cakes arrives at the same time you can get both of them):

from django.db.models import Max

hottest_cake_ids = Bakery.objects.annotate(
    hottest_cake_id=Max('cake__id')
).values_list('hottest_cak‌​e_id', flat=True)

hottest_cakes = Cake.objects.filter(id__in=hottest_cake_ids)

BTW credits for this goes to Daniel Roseman, who once answered similar question of mine:

http://groups.google.pl/group/django-users/browse_thread/thread/3b3cd4cbad478d34/3e4c87f336696054?hl=pl&q=

If the above method is too slow, then I know also second method - you can write custom SQL producing only those Cakes, that are hottest in relevant Bakeries, define it as database VIEW, and then write unmanaged Django model for it. It's also mentioned in the above django-users thread. Direct link to the original concept is here:

http://web.archive.org/web/20130203180037/http://wolfram.kriesing.de/blog/index.php/2007/django-nice-and-critical-article#comment-48425

Hope this helps.

Solution 3:

If you happen to be using PostGreSQL, you can use Django's interface to DISTINCT ON:

recent_cakes = Cake.objects.order_by('bakery__id', '-baked_at').distinct('bakery__id')

As the docs say, you must order by the same fields that you distinct on. As Simon pointed out below, if you want to do additional sorting, you'll have to do it in Python-space.