Django full text search using indexes with PostgreSQL
You need to add a SearchVectorField
to your MyEntry
, update it from your actual text fields and then perform the search on this field. However, the update can only be performed after the record has been saved to the database.
Essentially:
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchVector, SearchVectorField
class MyEntry(models.Model):
# The fields that contain the raw data.
very_important_field = models.TextField(blank=True, null=True)
also_important_field = models.TextField(blank=True, null=True)
not_so_important_field = models.TextField(blank=True, null=True)
not_important_field = models.TextField(blank=True, null=True)
tags = models.TextField(blank=True, null=True)
# The field we actually going to search.
# Must be null=True because we cannot set it immediately during create()
search_vector = SearchVectorField(editable=False, null=True)
class Meta:
# The search index pointing to our actual search field.
indexes = [GinIndex(fields=["search_vector"])]
Then you can create the plain instance as usual, for example:
# Does not set MyEntry.search_vector yet.
my_entry = MyEntry.objects.create(
very_important_field="something very important", # Fake Italien text ;-)
also_important_field="something different but equally important"
not_so_important_field="this one matters less"
not_important_field="we don't care are about that one at all"
tags="things, stuff, whatever"
Now that the entry exists in the database, you can update the search_vector
field using all kinds of options. For example weight
to specify the importance and config
to use one of the default language configurations. You can also completely omit fields you don't want to search:
# Update search vector on existing database record.
my_entry.search_vector = (
SearchVector("very_important_field", weight="A", config="italien")
+ SearchVector("also_important_field", weight="A", config="italien")
+ SearchVector("not_so_important_field", weight="C", config="italien")
+ SearchVector("tags", weight="B", config="italien")
)
my_entry.save()
Manually updating the search_vector
field every time some of the text fields change can be error prone, so you might consider adding an SQL trigger to do that for you using a Django migration. For an example on how to do that see for instance a blog article on Full-text Search with Django and PostgreSQL.
To actually search in MyEntry
using the index you need to filter and rank by your search_vector
field. The config
for the SearchQuery
should match the one of the SearchVector
above (to use the same stopword, stemming etc).
For example:
from django.contrib.postgres.search import SearchQuery, SearchRank
from django.core.exceptions import ValidationError
from django.db.models import F, QuerySet
search_query = SearchQuery("important", search_type="websearch", config="italien")
search_rank = SearchRank(F("search_vector"), search_query)
my_entries_found = (
MyEntry.objects.annotate(rank=search_rank)
.filter(search_vector=search_query) # Perform full text search on index.
.order_by("-rank") # Yield most relevant entries first.
)