Why is Postgres performance greatly reduced after upgrading from 9.2.24 to 10.17?

Problem

I have the following query:

    SELECT p.* FROM parties p INNER JOIN bib b ON p.id = b.id;

On Postgres version 9.2.24, this query takes 12 or 13 minutes. Here is the output of EXPLAIN ANALYZE on said query:

                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=156455.80..24854275.23 rows=4608791 width=46) (actual time=2516.474..407029.156 rows=4556073 loops=1)
   Hash Cond: (p.id = b.pid)
   ->  Seq Scan on parties p  (cost=0.00..22882138.64 rows=128697664 width=46) (actual time=371.099..220812.052 rows=133719944 loops=1)
   ->  Hash  (cost=98845.91..98845.91 rows=4608791 width=4) (actual time=2141.741..2141.741 rows=4608467 loops=1)
         Buckets: 524288  Batches: 1  Memory Usage: 162017kB
         ->  Seq Scan on bib b  (cost=0.00..98845.91 rows=4608791 width=4) (actual time=21.570..1199.429 rows=4608467 loops=1)
 Total runtime: 407293.833 ms
(7 rows)

On Postgres version 10.17, this query takes roughly 3 hours. Here is the output of EXPLAIN ANALYZE on said query:

                                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.57..12597544.56 rows=10945244 width=1038) (actual time=49.143..31536008.494 rows=10854924 loops=1)
   Workers Planned: 5
   Workers Launched: 5
   ->  Nested Loop  (cost=0.57..11502020.16 rows=2189049 width=1038) (actual time=88.401..31499828.808 rows=1809154 loops=6)
         ->  Parallel Seq Scan on bib b  (cost=0.00..147545.49 rows=2189049 width=4) (actual time=3.113..1830.876 rows=1824207 loops=6)
         ->  Index Scan using idx_parties_cmp on parties p  (cost=0.57..5.18 rows=1 width=1038) (actual time=17.262..17.265 rows=1 loops=10945244)
               Index Cond: (id = b.id)
 Planning time: 114.828 ms
 Execution time: 31536841.489 ms
(9 rows)

Notes

  • Both databases have similar data
  • Both databases have the same indexes
  • Both databases have the same configuration
  • Both databases are running with the same amount of RAM (125 GB)
  • The 9.2.24 database has more swap (11 GB vs 3 GB on 10.17)
  • I have run vacuumdb --all --analyze-in-stages on both (see the check below)
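
A quick way to double-check that the statistics really are current on both servers, sketched here against the two tables from the query:

    SELECT relname, last_analyze, last_autoanalyze, n_live_tup
    FROM pg_stat_user_tables
    WHERE relname IN ('parties', 'bib');  -- last (auto)analyze times and rough live row counts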

Questions

  • Why is the query taking drastically longer in Postgres 10.17? I realize that the second query returns ~6 million more rows, but it takes 525 minutes vs 7 minutes. Could a difference of ~6 million rows really account for a 75x slowdown?

  • Is cost measured the same way between Postgres versions? The estimated cost of the query on 10.17 is roughly half that of the query on 9.2, yet it takes roughly 30x as long.


The cost estimation has changed between 9.2 and 10, but not radically (assuming the configurable cost settings themselves have not been changed radically, which is something we can't know unless you tell us).
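
As a first sanity check, you could compare the settings that feed the cost estimates on the two servers and diff the output (a sketch; the parallel_* parameters exist only on v10, so the 9.2 run will simply return fewer rows):

    SELECT name, setting
    FROM pg_settings
    WHERE name IN ('seq_page_cost', 'random_page_cost',
                   'cpu_tuple_cost', 'cpu_index_tuple_cost', 'cpu_operator_cost',
                   'effective_cache_size', 'work_mem',
                   'max_parallel_workers_per_gather', 'parallel_setup_cost', 'parallel_tuple_cost')
    ORDER BY name;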

The biggest difference (of those probably relevant to you here) is the introduction of within-query parallelization, which of course required changes to the cost model. v10 thinks the total cost will be about half, but it also thinks it will have 6 processes (5 workers plus the leader) dedicated to the query, so in total it expects to consume about 3 times the resources. It is probably not getting any benefit from the parallelization (because your hardware is inadequate for it), but it incorrectly thinks it will.
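
A quick way to see how much of this comes from the parallel cost model is to turn parallelism off for one session on the v10 server and look at the estimate and plan it produces then (a sketch, using the query from the question):

    SET max_parallel_workers_per_gather = 0;  -- this session only; disables parallel plans
    EXPLAIN
    SELECT p.* FROM parties p INNER JOIN bib b ON p.id = b.id;
    -- if the planner now picks the hash join, the parallel cost model is what tipped the scales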

Note that the default setting of max_parallel_workers_per_gather in v10 is 2, but you are obviously running with a setting of at least 5. In other words, someone setting up your server apparently went out of their way to lie to PostgreSQL about how capable the server was.
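
You can check what the server is actually running with, and put it back to the v10 default, without touching any of the cost settings (ALTER SYSTEM needs superuser; a reload is enough for this parameter, no restart required):

    SHOW max_parallel_workers_per_gather;                     -- what the server is using now
    ALTER SYSTEM SET max_parallel_workers_per_gather = 2;     -- back to the v10 default
    SELECT pg_reload_conf();                                  -- pick up the change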

Now clearly that is not the only estimation problem. If we say it takes 75 times longer to return twice as much data, and forgive it a factor of 6 in parallelization credits, that still leaves it wrong by about another factor of 6. But for all we know, it was also off by that amount even in 9.2. It could be that the 6-fold estimation error still left the nested loop looking more expensive, so it wasn't chosen. And only when that error was combined with the parallel misestimation did the combined error become high enough to switch the plan.

The way to know for sure would be to force 9.2 to use the nested loop plan, and see how much slower it actually is, and how much costlier it is expected to be. Of course, if the purpose of the exercise is "how do I fix this" rather than "who do I blame for this", then doing that work against 9.2 might be pointless.
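
Forcing the plans is just a matter of the enable_* session settings, which exist in both versions, so the comparison can be run in both directions (a sketch; if 9.2 falls back to a merge join instead of the nested loop, turn enable_mergejoin off as well):

    -- on 9.2: steer the planner away from the hash join so it uses the nested loop
    SET enable_hashjoin = off;
    EXPLAIN ANALYZE SELECT p.* FROM parties p INNER JOIN bib b ON p.id = b.id;

    -- on 10.17: steer it away from the nested loop so it uses the hash join
    SET enable_nestloop = off;
    EXPLAIN ANALYZE SELECT p.* FROM parties p INNER JOIN bib b ON p.id = b.id;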