Why does pg_restore take so much longer on Ubuntu than Windows?

Even one hour is very long for a small dump file of 21 MB. We are restoring databases of 2 GB compressed dump file in about 30 minutes but we might have better hardware ;-)

What you should read first:

http://www.postgresql.org/docs/8.4/static/populate.html

It is all about your problem. It tells you how to poulate a database fast.

Aditional tips:

  • First enable logging of all statements with duration time and see what's going on
  • Increase shared_buffers, default on ubuntu 10.10 is only 24 MB, see http://www.postgresql.org/docs/8.4/static/kernel-resources.html#SYSVIPC for configuring your linux system to accept higher values
  • use --format=custom or -Fc for dumping. It's the best choice
  • you can run pg_restore on multiple CPU with "-j" but i guess you have other problems that getting the last bits of performance

For further information:

  • Read the excellent online documentation
  • Buy Postgresql 9.0 High Performance (not regarding your problem, it is just an excellent book for experienced PostgreSQL DBA)

Edit: I missed that you said the dump is merely 21MB, and not even compressed. Even 1 hour is a very long time to restore that amount of data. Could you shed some light on what the dump contains? What sort of table structure, how many indexes and what kind? Functional indexes? GiST/GIN indexes? How much data is generated after the dump is restored?

The PostgreSQL mailing list might be a better place to discuss this.

Old post

The default PostgreSQL configuration is very conservative in terms of resource requirements. Which means that during bulk-loading, it has to perform very frequent checkpoints (your Postgres logs are probably full of checkpoint warnings).

I suspect that PostgreSQL on Windows might not correctly flush everything to disk, thus checkpoints don't affect performance much. If true, this is of course bad for database integrity.

If my assumptions are true, bumping checkpoint_segments up to 50 in the Ubuntu configuration should make it perform similarly to Windows. (There are lots of other tunables, but this is the most important one for bulk-loading)

Also, what does SHOW wal_sync_method say on your Ubuntu installation? It should be fdatasync for optimal performance, but some versions defaulted to open_datasync.