How do you join two tables on a foreign key field using django ORM?

Let's assume I have the following models:

class Position(models.Model):
    name = models.CharField()

class PositionStats(models.Model):
    position = models.ForeignKey(Position)
    averageYards = models.CharField()
    averageCatches = models.CharField()

class PlayerStats(models.Model):
    player = models.ForeignKey(Player)
    averageYards = models.CharField()
    averageCatches = models.CharField()

class Player(models.Model):
    name = models.CharField()
    position = models.ForeignKey(Position)

I want to perform the equivalent SQL query using django's ORM:

SELECT *

FROM PlayerStats

JOIN Player ON player

JOIN PositionStats ON PositionStats.position = Player.position

How would I do that with django's ORM? The query isn't exactly correct, but the idea is that I want a single query, using django's ORM, that gives me PlayerStats joined with PositionStats based on the player's position.


Solution 1:

I've been working with django for a while now and I have had a pretty rough time figuring out the table joins, but I think I finally understand and I would like to pass this on to others so they may avoid the frustration that I had with it.

Consider the following model.py:

class EventsMeetinglocation(models.Model):
    id = models.IntegerField(primary_key=True)
    name = models.CharField(max_length=100)
    address = models.CharField(max_length=200)

    class Meta:
        managed = True
        db_table = 'events_meetinglocation'

class EventsBoardmeeting(models.Model):
    id = models.IntegerField(primary_key=True)
    date = models.DateTimeField()
    agenda_id = models.IntegerField(blank=True, null=True)
    location_id = models.ForeignKey(EventsMeetinglocation)
    minutes_id = models.IntegerField(blank=True, null=True)

    class Meta:
       managed = True
       db_table = 'events_boardmeeting'

Here we can see that location_id in EventsBoardmeeting is a foreign key for the id in EventsMeetinglocation. This means that we should be able to query the information in EventsMeetinglocation by going through EventsBoardmeeting.

Now consider the following views.py:

def meetings(request):
    meetingData = EventsBoardmeeting.objects.all()
    return render(request, 'board/meetings.html', {'data': meetingData })

As stated many times before in may other posts, django takes care of joins automatically. When we query everything in EventsBoardmeeting we also get any related information by foreign key as well, But the way that we access this in html is a little different. We have to go through the variable used as the foreign key to access the information associated with that join. For example:

{% for x in data %}
   {{ x.location_id.name }}
{% endfor %}

The above references ALL of the names in the table that were the result of the join on foreign key. x is essentially the EventsBoardmeeting table, so when we access x.location_id we are accessing the foreign key which gives us access to the information in EventsMeetinglocation.

Solution 2:

It isn't one query, but it's pretty efficient. This does one query for each table involved, and joins them in Python. More on prefetch_related here: https://docs.djangoproject.com/en/dev/ref/models/querysets/#prefetch-related

Player.objects.filter(name="Bob").prefetch_related(
        'position__positionstats_set', 'playerstats_set')

Solution 3:

select_related() and prefetch_related() is your solution. They work almost same way but has some difference.

select_related() works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. But it only works for one-to-one or one-to-many relation. Example is below-

entry = Entry.objects.select_related('blog').get(id=5)
or
entries = Entry.objects.filter(foo='bar').select_related('blog')

prefetch_related(), on the other hand, does a separate lookup for each relationship and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related. So prefetch_related will execute only one query for each relation. Example is given below-

Pizza.objects.all().prefetch_related('toppings')

Solution 4:

In Django 3.2, the framework automatically follows relationships when using method QuerySet.filter()

# The API automatically follows relationships as far as you need.
# Use double underscores to separate relationships.
# This works as many levels deep as you want; there's no limit.
# Find all Choices for any question whose pub_date is in this year
# (reusing the 'current_year' variable we created above).
>>> Choice.objects.filter(question__pub_date__year=current_year)

This compiles to the following SQL query:

SELECT
    "polls_choice"."id",
    "polls_choice"."question_id",
    "polls_choice"."choice_text",
    "polls_choice"."votes"
FROM
    "polls_choice"
INNER JOIN "polls_question" ON
    ("polls_choice"."question_id" = "polls_question"."id")
WHERE
    "polls_question"."pub_date" BETWEEN 2020-12-31 23:00:00 AND 2021-12-31 22:59:59.999999

See tutorial here: https://docs.djangoproject.com/en/3.2/intro/tutorial02/