PostgreSQL slow commit performance
The server has incredibly, unspeakably, amazingly slow fsync performance. There's something very badly wrong with your software RAID 1 setup. The terrible fsync performance is almost certainly the cause of your performance problems.
The desktop merely has very slow fsync.
You can work around the performance issues, at the cost of losing some data after a crash, by setting synchronous_commit = off and setting a commit_delay. You really need to sort out the disk performance on the server, though; that's jaw-droppingly slow.
For comparison, here's what I get on my laptop (i7, 8GB RAM, mid-range 128G SSD, pg_test_fsync from 9.2):
Compare file sync methods using one 8kB write:
open_datasync 4445.744 ops/sec
fdatasync 4225.793 ops/sec
fsync 2742.679 ops/sec
fsync_writethrough n/a
open_sync 2907.265 ops/sec
Admittedly this SSD probably isn't hard-power-loss-safe, but it's not like a decent power-fail-safe SSD costs a great deal when we're talking server costs.
This is the pg_test_fsync output on my server, with a very similar configuration: Linux software RAID1 on 2 consumer-grade disks (WD10EZEX-00RKKA0):
# ./pg_test_fsync -s 3
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 115.375 ops/sec
fdatasync 109.369 ops/sec
fsync 27.081 ops/sec
fsync_writethrough n/a
open_sync 112.042 ops/sec
...
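To put the ops/sec figures in perspective, they can be converted into time per sync. This is only a rough sketch: the 7200 rpm spindle speed is an assumption about these drives, not something pg_test_fsync reports, and the labels are just illustrative names for the numbers quoted in this thread.

```python
def ms_per_sync(ops_per_sec):
    """Average time one sync call takes, in milliseconds."""
    return 1000.0 / ops_per_sec

# Assumption: a 7200 rpm disk, so one full platter rotation takes 60000 / 7200 ms.
ROTATION_MS = 60_000 / 7200

# ops/sec figures quoted in this thread.
results = {
    "SSD laptop, fsync": 2742.679,
    "HDD RAID1, open_datasync": 115.375,
    "HDD RAID1, fsync": 27.081,
}

for label, ops in results.items():
    ms = ms_per_sync(ops)
    print(f"{label}: {ms:.2f} ms per sync, {ms / ROTATION_MS:.1f} rotations")
```

Under that assumption, roughly 110 to 115 ops/sec works out to about one sync per platter rotation, which is what a plain rotating disk without a write cache can be expected to deliver.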
You did test this on a completely idle server, didn't you?
Maybe you have unaligned partitions. Check:
parted /dev/sda unit s print
This is the output of my server:
Model: ATA WDC WD10EZEX-00R (scsi)
Disk /dev/sda: 1953525168s
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Number Start End Size Type File system Flags
1 2048s 67110911s 67108864s primary ext4 boot, raid
2 67110912s 603981823s 536870912s primary raid
3 603981824s 608176127s 4194304s primary linux-swap(v1)
4 608176128s 1953523711s 1345347584s primary raid
Check that each number in the Start column is divisible by 2048 (2048 sectors of 512B is 1MiB). For proper 4096B alignment, divisible by 8 would suffice, but alignment-aware utilities start partitions at 1MiB boundaries.
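The divisibility check can be automated. The sketch below assumes the parted output has the column layout shown above (partition number, then start, end and size in sectors with an "s" suffix); the function name misaligned_partitions is made up for this example.

```python
import re

def misaligned_partitions(parted_output, align_sectors=2048):
    """Return (number, start_sector) for every partition whose start
    sector is not a multiple of align_sectors."""
    bad = []
    for line in parted_output.splitlines():
        # Partition rows look like: "1 2048s 67110911s 67108864s primary ext4 boot, raid"
        m = re.match(r"\s*(\d+)\s+(\d+)s\s+\d+s\s+\d+s\b", line)
        if m:
            number, start = int(m.group(1)), int(m.group(2))
            if start % align_sectors != 0:
                bad.append((number, start))
    return bad

sample = """\
Disk /dev/sda: 1953525168s
Number Start End Size Type File system Flags
1 2048s 67110911s 67108864s primary ext4 boot, raid
2 67110912s 603981823s 536870912s primary raid
3 603981824s 608176127s 4194304s primary linux-swap(v1)
4 608176128s 1953523711s 1345347584s primary raid
"""

print(misaligned_partitions(sample))
```

For the partition table above this prints an empty list, since every start sector is a multiple of 2048. Passing align_sectors=8 tests only for 4096B physical-sector alignment with 512B logical sectors (4096 / 512 = 8).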
Also, maybe you're using non-default mount options, like data=journal, which have a big impact on performance. Show the output of mount -v | grep ^/dev/ on your server. This is mine:
/dev/md0 on / type ext4 (rw,barrier,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0)
/dev/md2 on /home type ext4 (rw,barrier,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0)
/dev/md1 on /var type ext4 (rw,barrier,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0)
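If there are many mounts to look through, a small script can pick out any explicit data= journalling mode. This is a minimal sketch that assumes the "device on mountpoint type fstype (options)" line format shown above; explicit_data_modes is a hypothetical helper name, and /dev/md3 is an invented example device.

```python
import re

def explicit_data_modes(mount_output):
    """Return (device, mount_point, mode) for every mount that sets an
    explicit data= option, e.g. data=journal."""
    found = []
    for line in mount_output.splitlines():
        m = re.match(r"(/dev/\S+) on (\S+) type (\S+) \(([^)]*)\)", line)
        if not m:
            continue
        device, mount_point, _fstype, options = m.groups()
        for opt in options.split(","):
            if opt.startswith("data="):
                found.append((device, mount_point, opt[len("data="):]))
    return found

sample = """\
/dev/md0 on / type ext4 (rw,barrier,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0)
/dev/md3 on /pgdata type ext4 (rw,barrier,data=journal)
"""

print(explicit_data_modes(sample))
```

A mount that does not list a data= option is simply not reported, so an empty result only means no mode was set explicitly.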
Maybe one of your disks is broken. Create one partition on each disk with no RAID (maybe you reserved swap partitions on both disks; if so, use these, as there's no use for RAID on swap anyway). Create filesystems there and run pg_test_fsync on each drive. If one disk has problems, the good one has to wait for it when both are mirrored.
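Once you have pg_test_fsync output for each unmirrored disk, the two runs can be compared method by method. The sketch below assumes the "method value ops/sec" line format shown earlier; the helper names, the factor of 2 used as a threshold, and the sample figures are all made up for illustration and are not measurements.

```python
import re

def parse_ops(output):
    """Map each sync method to its ops/sec figure; 'n/a' rows are skipped."""
    rates = {}
    for line in output.splitlines():
        m = re.match(r"\s*(\w+)\s+([0-9.]+)\s+ops/sec", line)
        if m:
            rates[m.group(1)] = float(m.group(2))
    return rates

def lagging_methods(rates_a, rates_b, factor=2.0):
    """List (method, slower_disk) where one disk is more than `factor`
    times slower than the other. The threshold is an arbitrary choice."""
    lagging = []
    for method in sorted(rates_a.keys() & rates_b.keys()):
        a, b = rates_a[method], rates_b[method]
        if max(a, b) > factor * min(a, b):
            lagging.append((method, "a" if a < b else "b"))
    return lagging

# Hypothetical figures for two unmirrored disks, not real measurements.
disk_a = parse_ops("fdatasync 109.369 ops/sec\nfsync 27.081 ops/sec\nfsync_writethrough n/a")
disk_b = parse_ops("fdatasync 12.500 ops/sec\nfsync 25.900 ops/sec\nfsync_writethrough n/a")

print(lagging_methods(disk_a, disk_b))
```

With these invented numbers only fdatasync on disk "b" is flagged, which is the kind of asymmetry that would point at a single failing drive rather than at the RAID layer.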
Check that your BIOS is set to use AHCI mode instead of IDE mode. A server would benefit from Native Command Queuing, which is not available in IDE mode.
Ignore the comparison with the SSD. It's ridiculous to compare.