how to filter only the rows containing first occurrence of an item in Django query
I have below data from my Django model
id Date value
0 1975 a
21 1975 b
1 1976 b
22 1976 c
3 1977 a
2 1977 b
4 1978 c
25 1978 d
5 1979 e
26 1979 f
6 1980 a
27 1980 f
Model
class Items(models.Model):
date = models.DateField()
value = models.CharField()
I am having trouble finding a way to keep only the lines containing the first occurrence of a'value'. I want to drop duplicate 'values', keeping the row with the lowest 'Date'.The end result should be:
id Date value
0 1975 a
21 1975 b
22 1976 c
25 1978 d
5 1979 e
26 1979 f
Solution 1:
You can filter by eliminating all items for which there is an earlier record:
from django.db.models import Exists, OuterRef
Items.objects.filter(
~Exists(Items.objects.filter(
Q(date__lt=OuterRef('date')) | Q(Date=OuterRef('date'), pk__lt=OuterRef('pk')),
value=OuterRef('value')
))
)
We thus check if there is a record with the same value
where the date
is smaller than our current record, or where the date is the same but the primary key is smaller (as tie breaker).
Note: normally a Django model is given a singular name, so
Items
instead of.Item