How to create an index to avoid sorting step used with DISTINCT?
PostgreSQL 9.2
I have the following table (tbl
):
-------------------------------------------------------------
| id | mailing_id | recipient_id | delivery_state_id |
-------------------------------------------------------------
| PK | integer | integer | integer |
-------------------------------------------------------------
Also, the I created the following index:
CREATE INDEX idx_name
ON tbl
USING btree
(recipient_id);
Since, indexes in posgtresql have default sorting, I expected the query
SELECT DISTINCT recipient_id
FROM tbl
can avoid the sort step. But running
EXPLAIN ANALYZE SELECT DISTINCT recipient_id
FROM mailing.mailing_recipient mr
show me that it can't:
Unique (cost=1401370.66..1442288.31 rows=145798 width=4) (actual time=9377.410..11388.869 rows=1037472 loops=1)
-> Sort (cost=1401370.66..1421829.48 rows=8183530 width=4) (actual time=9377.408..10849.160 rows=8183160 loops=1)
Sort Key: recipient_id
Sort Method: external merge Disk: 111968kB
-> Seq Scan on tbl (cost=0.00..126072.30 rows=8183530 width=4) (actual time=0.008..1073.771 rows=8183160 loops=1)
Total runtime: 11448.373 ms
As you can see, there's still sorting.
Question: How can I create the index to avoid sorting step?
Solution 1:
Make sure your order by statement matches your indexes exactly, including NULLS LAST (or FIRST) in your queries.
Solution 2:
This surprises me; I would expect Postgres to be smarter than that. What happens with this version?
SELECT recipient_id
FROM tbl
GROUP BY recipient_id;
What version of Postgres are you using? Postgres introduced index-only scans in version 9.2 (see here), which might explain the lack of use of the index. I can say that an index scan is used for distinct
in 9.3.
Here is an explain in 9.3 on a similar query (select distinct totalprice from orders
):
Unique (cost=0.42..5505.62 rows=2794 width=8)
-> Index Only Scan using idx_orders_totalprice on orders (cost=0.42..5023.16 rows=192983 width=8)"