PHP FPM keeps hanging

I have tried all sorts of options, but every couple of days (most days, in fact) FPM stops serving pages and I get a 502 from the Cherokee web server.

The logs are filled with the following:

[15-Sep-2014 10:17:46] WARNING: [pool www] child 10135 exited on signal 11 (SIGSEGV - core dumped) after 15.512406 seconds from start
[15-Sep-2014 10:17:46] NOTICE: [pool www] child 10138 started
[15-Sep-2014 10:18:02] WARNING: [pool www] child 10138 exited on signal 11 (SIGSEGV - core dumped) after 15.657950 seconds from start
[15-Sep-2014 10:18:02] NOTICE: [pool www] child 10166 started
[15-Sep-2014 10:18:20] WARNING: [pool www] child 10212 exited on signal 11 (SIGSEGV - core dumped) after 10.192596 seconds from start
[15-Sep-2014 10:18:20] NOTICE: [pool www] child 10214 started
[15-Sep-2014 10:19:08] WARNING: [pool www] child 10216 exited on signal 11 (SIGSEGV - core dumped) after 42.754452 seconds from start
[15-Sep-2014 10:19:08] NOTICE: [pool www] child 10242 started
[15-Sep-2014 10:20:22] WARNING: [pool www] child 10332 exited on signal 11 (SIGSEGV - core dumped) after 14.862183 seconds from start
[15-Sep-2014 10:20:22] NOTICE: [pool www] child 10494 started
[15-Sep-2014 10:20:48] WARNING: [pool www] child 10494 exited on signal 11 (SIGSEGV - core dumped) after 26.415409 seconds from start
[15-Sep-2014 10:20:48] NOTICE: [pool www] child 10498 started
[15-Sep-2014 10:32:48] WARNING: [pool www] child 11718 exited on signal 11 (SIGSEGV - core dumped) after 21.319360 seconds from start
[15-Sep-2014 10:32:48] NOTICE: [pool www] child 11720 started

Every time this happens, the last log entry is similar to:

[15-Sep-2014 11:01:34] WARNING: [pool www] server reached max_children setting (50), consider raising it
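To quantify how often the children crash before the pool runs out of workers, the log can be tallied with a short script. This is only a minimal sketch, assuming the log lines keep exactly the format shown above; the function name `summarize_fpm_log` and the returned dictionary keys are hypothetical, chosen for illustration.

```python
import re
from collections import Counter

# Matches child-exit lines in the format shown in the log excerpt, e.g.
# "... [pool www] child 10135 exited on signal 11 (SIGSEGV - core dumped)
#  after 15.512406 seconds from start"
EXIT_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] child (?P<pid>\d+) exited on signal (?P<sig>\d+)"
    r".* after (?P<secs>[\d.]+) seconds from start"
)
# Matches the "server reached max_children setting (50)" warning.
MAX_CHILDREN_RE = re.compile(r"server reached max_children setting \((\d+)\)")


def summarize_fpm_log(lines):
    """Count signal exits per signal number and max_children warnings.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    signals = Counter()
    lifetimes = []
    max_children_hits = 0
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            signals[int(match.group("sig"))] += 1
            lifetimes.append(float(match.group("secs")))
        elif MAX_CHILDREN_RE.search(line):
            max_children_hits += 1
    mean_lifetime = sum(lifetimes) / len(lifetimes) if lifetimes else None
    return {
        "signals": dict(signals),
        "mean_lifetime": mean_lifetime,
        "max_children_hits": max_children_hits,
    }
```

It could be fed the lines of /var/log/php5-fpm.log (the `error_log` path from the config below), for example with `summarize_fpm_log(open("/var/log/php5-fpm.log"))`. A high count under signal 11 would confirm that the workers are dying from segmentation faults and not from load.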

These are the connections according to Cherokee at the same time; it's not even a spike. (Cherokee connection graph; image not available)

I have tried the dynamic, static and ondemand process managers and nothing changes. No matter what max_children I set, it eventually dies.

Why it can't just recover I don't know, but I'm getting to the point of either switching to something else or adding a crontab entry that restarts FPM every 30 minutes.

server

  • rackspace 1st gen 1024 MB RAM, 40 GB Disk
  • Ubuntu 12.04 LTS
  • cherokee 1.2.103

PHP 5.3.10-1ubuntu3.11 with Suhosin-Patch (cli) (built: Apr 4 2014 01:30:04) Copyright (c) 1997-2012 The PHP Group Zend Engine v2.3.0, Copyright (c) 1998-2012 Zend Technologies

The site gets around 2k page views pm, so it's not even a big load.

Memory usage hovers at around 300 to 400 MB, swap is empty, and the load average is below about 1.5.

fpm config

[global]
pid = /var/run/php5-fpm.pid
error_log = /var/log/php5-fpm.log
emergency_restart_threshold = 5
emergency_restart_interval = 1s
process.max = 75


include=/etc/php5/fpm/pool.d/*.conf

pool config

[www]
user = www-data
group = www-data
listen = 127.0.0.1:9000

pm = ondemand
pm.max_children = 50
pm.start_servers = 3
pm.min_spare_servers = 2
pm.max_spare_servers = 6
pm.process_idle_timeout = 10s

pm.max_requests = 100
pm.status_path = /status

ping.path = /fpm/ping

chdir = /

Solution 1:

Increasing the number of servers or changing your config or your code is not going to help with a segmentation fault. Even in 2014, PHP 5.3.10 was long in the tooth and due an upgrade. You could analyse the core dumps with gdb, but nobody will be very interested in fixing a bug in such an old version of PHP, so the practical answer is to upgrade.
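If you do want to look at a core dump first, gdb can print a backtrace non-interactively. The following is a rough sketch only: the binary path `/usr/sbin/php5-fpm` is an assumption about where the package installs FPM, the core file path is hypothetical (it depends on your kernel's core pattern), and both helper names are made up for illustration. Without debug symbols installed the backtrace will mostly show raw addresses.

```python
import subprocess


def gdb_backtrace_cmd(binary, core):
    """Build a gdb batch invocation that prints a backtrace from a core file.

    -batch makes gdb exit after running its commands; -ex runs one command.
    """
    return ["gdb", "-batch", "-ex", "bt", binary, core]


def print_backtrace(binary, core):
    """Run gdb and return its standard output (requires gdb to be installed)."""
    result = subprocess.run(
        gdb_backtrace_cmd(binary, core),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, `print_backtrace("/usr/sbin/php5-fpm", "/tmp/core")`, with both paths adjusted to your system. The top frames usually point at the extension or engine function that crashed, which helps when matching against known bug reports.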

Solution 2:

I wonder if you are hitting the following bug:

https://bugs.php.net/bug.php?id=62205

You might try upgrading PHP.