PostgreSQL Index Usage Analysis

Is there a tool or method to analyze a Postgres database and determine which missing indexes should be created, and which unused indexes should be removed? I have a little experience doing this with the Profiler tool for SQL Server, but I'm not aware of a similar tool included with Postgres.


Solution 1:

I like this to find missing indexes:

SELECT
  relname                                 AS TableName,
  to_char(seq_scan, '999,999,999,999')    AS TotalSeqScan,
  to_char(idx_scan, '999,999,999,999')    AS TotalIndexScan,
  to_char(n_live_tup, '999,999,999,999')  AS TableRows,
  pg_size_pretty(pg_relation_size(relid)) AS TableSize
FROM pg_stat_all_tables
WHERE schemaname = 'public'
      -- coalesce: idx_scan is NULL for tables with no indexes at all
      AND 50 * seq_scan > coalesce(idx_scan, 0) -- seq scans exceed 2% of index scans
      AND n_live_tup > 10000
      AND pg_relation_size(relid) > 5000000     -- ignore tables smaller than ~5 MB
ORDER BY relname ASC;

This flags tables where sequential scans are disproportionately frequent compared to index scans. Small tables are ignored, since Postgres prefers sequential scans for them anyway.

The query above reveals tables that are likely missing indexes.
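
Keep in mind that these counters accumulate from the last statistics reset onward; if you want them to reflect only the current workload, you can clear them first (note that this wipes all statistics for the current database):

SELECT pg_stat_reset();  -- zeroes all statistics counters for the current database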

The next step would be to detect missing combined (multi-column) indexes. That is not easy, but it is doable; analyzing the slow queries recorded by the pg_stat_statements extension can help.
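
As a starting point, here is a minimal sketch against pg_stat_statements that surfaces the most expensive statements. It assumes the extension has been created and loaded via shared_preload_libraries:

-- Assumes: CREATE EXTENSION pg_stat_statements; has been run, and
-- shared_preload_libraries includes 'pg_stat_statements'.
SELECT
  calls,
  round((total_exec_time / calls)::numeric, 2) AS mean_ms, -- column is named total_time before PostgreSQL 13
  rows,
  query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;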

Solution 2:

Check the statistics: the pg_stat_user_tables and pg_stat_user_indexes views are the place to start.

See "The Statistics Collector".

Solution 3:

On the determine-missing-indexes approach: no, there is no built-in tool. But there are plans to make this easier in a future release, such as pseudo-indexes and machine-readable EXPLAIN output.

Currently, you'll need to run EXPLAIN ANALYZE on poorly performing queries and then manually determine the best course of action. Log analyzers such as pgFouine can help identify which queries to look at.
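
A sketch of that workflow, using a hypothetical orders table and customer_id column:

-- Table and column names here are hypothetical.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;

-- If the plan shows a Seq Scan node that filters out most of the rows,
-- an index on the filtered column may help:
-- CREATE INDEX orders_customer_id_idx ON orders (customer_id);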

As for unused indexes, you can use something like the following to help identify them:

SELECT * FROM pg_stat_all_indexes WHERE schemaname <> 'pg_catalog';

This shows, for each index, the number of scans and the tuples read and fetched; indexes whose idx_scan remains at zero are candidates for removal.
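
A more targeted sketch, if you want only indexes that have never been scanned since the last statistics reset and that do not enforce a constraint (unique and primary key indexes should usually stay):

SELECT
  s.schemaname,
  s.relname,
  s.indexrelname,
  pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0     -- never used since the last stats reset
  AND NOT i.indisunique  -- unique/primary key indexes enforce constraints
ORDER BY pg_relation_size(s.indexrelid) DESC;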