Troubleshooting a Redis Stall

Solution 1:

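What is your setting for /proc/sys/vm/zone_reclaim_mode? Try setting it to 0. On a NUMA machine, a nonzero value makes the kernel try to reclaim memory on the local node before allocating from a remote one, and that reclaim can stall the process. There's plenty written about it online if you search for 'zone_reclaim_mode', so I won't rehash it here.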
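A quick way to check the current value is to read the sysctl file directly. The sketch below assumes a Linux host; the helper name read_zone_reclaim_mode is made up for illustration, and the file may be missing on kernels built without NUMA support.

```python
from pathlib import Path

# Standard location of the sysctl on Linux (may be absent without NUMA support).
ZONE_RECLAIM_PATH = Path("/proc/sys/vm/zone_reclaim_mode")


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current zone_reclaim_mode as an int, or None if the file is missing."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    if mode is None:
        print("zone_reclaim_mode is not exposed on this system")
    elif mode == 0:
        print("zone reclaim is already off")
    else:
        # Changing the value needs root, e.g.: sysctl -w vm.zone_reclaim_mode=0
        print(f"zone_reclaim_mode is {mode}; consider setting it to 0")
```

The script only reads the value. Changing it is left as a manual step because it requires root and affects the whole machine.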

Solution 2:

When Redis forks to checkpoint (an RDB snapshot), the Linux kernel needs to duplicate the process's page tables for copy-on-write. If you have a lot of RAM, this can take a long time. We have a 200 GB Redis instance that takes 8 seconds to fork, and the machine is deaf to the world while this happens.
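To get a feel for why the fork is slow, here is a back-of-the-envelope estimate of how much page-table data the kernel has to copy. It is only a sketch under assumptions not stated above: x86-64 with 8-byte page-table entries, 200 GB treated as 200 GiB, all of it resident, and only the last-level tables counted. The function name is made up for illustration, and real fork time depends on more than the raw byte count.

```python
GIB = 2**30


def leaf_page_table_bytes(mem_bytes, page_size=4096, entry_size=8):
    """Approximate size of the last-level page tables that map mem_bytes of memory."""
    pages = -(-mem_bytes // page_size)  # ceiling division
    return pages * entry_size


small = leaf_page_table_bytes(200 * GIB)                      # 4 KiB pages
huge = leaf_page_table_bytes(200 * GIB, page_size=2 * 2**20)  # 2 MiB huge pages

print(f"4 KiB pages: {small / 2**20:.0f} MiB of page-table entries")
print(f"2 MiB pages: {huge / 2**10:.0f} KiB of page-table entries")
```

With 4 KiB pages the estimate comes to roughly 400 MiB of entries to duplicate on every fork, against under 1 MiB with 2 MiB pages, which is why page size matters so much for fork time.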

Workarounds (from easy to hard):

checkpoint less often, by raising the time and key-change thresholds that trigger a checkpoint (the save setting in redis.conf)
shard your data into multiple Redis processes, each of which uses less RAM
try AOF instead of checkpointing, although this will still fork occasionally to rewrite the log
try huge pages, although you may need to double your physical RAM: copy-on-write then copies a whole huge page per write, so approximately everything will be dirtied while checkpointing
screw it and go with Postgres
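Whichever workaround you try, it helps to measure the fork time before and after. Redis reports the duration of its most recent fork as latest_fork_usec in the stats section of INFO. The sketch below parses that text; the function names are made up for illustration, and the sample string is invented, not output from a real server.

```python
def parse_info(text):
    """Parse the key:value lines of Redis INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Stats".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def last_fork_seconds(info_text):
    """Return the duration of the most recent fork in seconds, or None if absent."""
    usec = parse_info(info_text).get("latest_fork_usec")
    return None if usec is None else int(usec) / 1_000_000


# Invented sample for illustration only.
SAMPLE = "# Stats\r\nlatest_fork_usec:8000000\r\n"
print(last_fork_seconds(SAMPLE))
```

On a live server, the text can come from something like redis-cli info stats, piped into the script or saved to a file.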
