How do I force Postgres to use a particular index?

How do I force Postgres to use an index when it would otherwise insist on doing a sequential scan?


Assuming you're asking about the common "index hinting" feature found in many databases, PostgreSQL doesn't provide one; this was a conscious decision by the PostgreSQL team. A good overview of why, and of what you can do instead, can be found here. In short, index hints are a performance hack that tends to cause problems down the line as your data changes, whereas PostgreSQL's optimizer can re-evaluate the plan based on current statistics. A query plan that is good today probably won't be a good query plan for all time, and an index hint would force that one plan for all time.

As a very blunt hammer, useful for testing, you can use the enable_seqscan and enable_indexscan parameters. See:

  • Examining index usage
  • enable_ parameters
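
As a minimal sketch of how that looks in practice (the table and column names here are placeholders), you can flip the parameter in your session, compare the resulting plans, and restore the default:

SET enable_seqscan = off;                            -- planner now avoids sequential scans wherever possible
EXPLAIN ANALYZE SELECT * FROM orders WHERE id = 42;  -- compare this plan to the one you get by default
SET enable_seqscan = on;                             -- restore the default when you're done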

These are not suitable for ongoing production use. If you have issues with query plan choice, you should see the documentation for tracking down query performance issues. Don't just set enable_ params and walk away.

Unless you have a very good reason for using the index, Postgres may be making the correct choice. Why?

  • For small tables, it's faster to do sequential scans.
  • Postgres doesn't use indexes when data types don't match properly; you may need to add appropriate casts (see the example after this list).
  • Your planner settings (for instance, unrealistic cost parameters) might be causing problems.
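
As one concrete illustration of the datatype point (a hypothetical table, and assuming enough rows that the planner would otherwise pick the index): comparing an indexed bigint column to a numeric literal forces a cast on the column, which defeats the index, while an integer literal doesn't:

CREATE TABLE t (id bigint PRIMARY KEY);
EXPLAIN SELECT * FROM t WHERE id = 42.0;  -- numeric literal: the column is cast, so the index can't be used
EXPLAIN SELECT * FROM t WHERE id = 42;    -- integer literal: the primary key index can be used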

See also this old newsgroup post.


Probably the only valid reason for using

SET enable_seqscan = false

is when you're writing queries and want to quickly see what the query plan would be if the table(s) contained large amounts of data. Or, of course, when you need to quickly confirm that your query is not using an index simply because the dataset is too small.
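
If you do use it that way, one safe pattern (sketched here with a placeholder table name) is to scope the override to a single transaction with SET LOCAL, so it can't leak into anything else:

BEGIN;
SET LOCAL enable_seqscan = off;                  -- applies only until COMMIT or ROLLBACK
EXPLAIN SELECT * FROM small_table WHERE id = 1;  -- inspect the plan with seq scans penalized
ROLLBACK;                                        -- the setting reverts automatically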


TL;DR

Run the following three commands and check whether the problem is fixed:

ANALYZE;
SET random_page_cost = 1.0;
SET effective_cache_size = 'X GB';    -- replace X with total RAM size minus 2 GB

Read on for further details and background information about this.

Step 1: Analyze tables

As a simple first attempt to fix the issue, run the ANALYZE command as the database superuser to update all table statistics. From the documentation:

The query planner uses these statistics to help determine the most efficient execution plans for queries.
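
For example (the table name below is a placeholder), you can analyze everything, target a single table, or check when statistics were last refreshed:

ANALYZE;                    -- update statistics for all tables you have access to
ANALYZE orders;             -- or target just one table
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables;   -- when did (auto)analyze last run for each table?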

Step 2: Set the correct random page cost

Index scans require non-sequential disk page fetches. PostgreSQL uses the random_page_cost configuration parameter to estimate the cost of such non-sequential fetches in relation to sequential fetches. From the documentation:

Reducing this value [...] will cause the system to prefer index scans; raising it will make index scans look relatively more expensive.

The default value is 4.0, meaning a random fetch is assumed to cost four times as much as a sequential fetch, with caching effects taken into account. However, if your database is stored on an SSD, then you should actually set random_page_cost to 1.1, according to the documentation:

Storage that has a low random read cost relative to sequential, e.g., solid-state drives, might also be better modeled with a lower value for random_page_cost, e.g., 1.1.

Also, if an index is mostly (or even entirely) cached in RAM, then an index scan will typically be significantly faster than a disk-served sequential scan. The query planner, however, doesn't know which parts of the index are already cached, and thus might make an incorrect decision.

If your database indices are frequently used, and if the system has sufficient RAM, then the indices are likely to be cached eventually. In such a case, random_page_cost can be set to 1.0, or even to a value below 1.0 to aggressively prefer using index scans (although the documentation advises against doing that). You'll have to experiment with different values and see what works for you.

As a side note, you could also consider using the pg_prewarm extension to explicitly cache your indices into RAM.
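
A minimal sketch of that, assuming the pg_prewarm extension is installed and using a placeholder index name:

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('orders_customer_id_idx');  -- read the index's blocks into the buffer cache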

You can set the random_page_cost like this:

SET random_page_cost = 1.0;
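
To check whether the change actually flips the plan, compare the EXPLAIN output before and after in the same session (the query below is a placeholder):

SHOW random_page_cost;                                -- confirm the value now in effect
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;  -- re-run and compare against the earlier plan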

Step 3: Set the correct cache size

On a system with 8 or more GB RAM, you should set the effective_cache_size configuration parameter to the amount of memory which is typically available to PostgreSQL for data caching. From the documentation:

A higher value makes it more likely index scans will be used, a lower value makes it more likely sequential scans will be used.

Note that this parameter doesn't change the amount of memory which PostgreSQL will actually allocate, but is only used to compute cost estimates. A reasonable value (on a dedicated database server, at least) is the total RAM size minus 2 GB. The default value is 4 GB.

You can set the effective_cache_size like this:

SET effective_cache_size = '14 GB';   -- e.g. on a dedicated server with 16 GB RAM

Step 4: Fix the problem permanently

You probably want to use ALTER SYSTEM SET ... or ALTER DATABASE db_name SET ... to set the new configuration parameter values permanently (either globally or per-database). See the documentation for details about setting parameters.
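
A sketch of how that might look (the database name is a placeholder), persisting the values from the earlier steps:

ALTER SYSTEM SET random_page_cost = 1.0;          -- written to postgresql.auto.conf, applies cluster-wide
ALTER SYSTEM SET effective_cache_size = '14GB';
SELECT pg_reload_conf();                          -- reload the configuration so the new values take effect
ALTER DATABASE mydb SET random_page_cost = 1.0;   -- or scope a setting to a single database instead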

Step 5: Additional resources

If it still doesn't work, then you might also want to take a look at this PostgreSQL Wiki page about server tuning.