Ubuntu's garbage collection cron job for PHP sessions takes 25 minutes to run, why?

Ubuntu has a cron job set up which looks for and deletes old PHP sessions:

# Look for and purge old sessions every 30 minutes
09,39 *     * * *     root   [ -x /usr/lib/php5/maxlifetime ] \
   && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 \
   -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir \
   fuser -s {} 2> /dev/null \; -delete

My problem is that this process is taking a very long time to run, with lots of disk IO. Here's my CPU usage graph:

CPU usage graph

The cleanup runs are represented by the teal spikes. At the beginning of the period, PHP's cleanup jobs were scheduled at the default times of 09 and 39 minutes past the hour. At 15:00 I removed the 39-minute entry from cron, so a cleanup job twice the size runs half as often (you can see the peaks become twice as wide and half as frequent).

Here are the corresponding graphs for IO time:

IO time

And disk operations:

Disk operations

At the peak, when there were about 14,000 active sessions, the cleanup can be seen to run for a full 25 minutes, apparently using 100% of one CPU core and what seems to be 100% of the disk IO for the entire period. Why is it so resource-intensive? An ls of the session directory /var/lib/php5 takes just a fraction of a second, so why does it take a full 25 minutes to trim old sessions? Is there anything I can do to speed this up?

The filesystem for this device is currently ext4, running on Ubuntu Precise 12.04 64-bit.

EDIT: I suspect that the load is due to the unusual process "fuser" (since I expect a simple rm to be a damn sight faster than the performance I'm seeing). I'm going to remove the use of fuser and see what happens.


Solution 1:

Removing fuser should help. This job runs a fuser command (which checks whether a file is currently open) for every session file found, which can easily take several minutes on a busy system with 14k sessions. This was a Debian bug (Ubuntu is based on Debian).

Instead of memcached you can also try tmpfs (an in-memory filesystem) for session files. Like memcached this invalidates sessions on reboot (which can be worked around by backing the directory up in a shutdown script and restoring it in a startup script), but it is much easier to set up. It will not, however, help with the fuser problem.
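If you go the tmpfs route, a minimal sketch would be an /etc/fstab entry like the following (the size value is an assumption to adjust for your traffic; mode 1733 matches the usual permissions on /var/lib/php5):

```
# Hypothetical /etc/fstab entry: keep PHP session files in RAM.
# The size is an assumption - tune it to your session volume.
tmpfs   /var/lib/php5   tmpfs   size=128m,mode=1733   0   0
```

Remember that everything in this mount disappears on reboot, so pair it with the backup/restore scripts mentioned above if you need sessions to survive a restart.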

Solution 2:

Congratulations on having a popular web site and managing to keep it running on a virtual machine for all this time.

If you're really pulling in two million pageviews per day, then you're going to stack up a LOT of PHP sessions in the filesystem, and they're going to take a long time to delete no matter whether you use fuser or rm or a vacuum cleaner.

At this point I'd recommend you look into alternate ways to store your sessions:

  • One option is to store sessions in memcached. This is lightning fast, but if the server crashes or restarts, all your sessions are lost and everyone is logged out.
  • You can also store sessions in a database. This would be a bit slower than memcached, but the database would be persistent, and you could clear old sessions with a simple SQL query. To implement this, though, you have to write a custom session handler.
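For the memcached option, a minimal sketch of the php.ini changes (this assumes the php5-memcached extension is installed; the server address is an assumption):

```ini
; Hypothetical php.ini fragment: hand session storage to memcached.
; Requires the php5-memcached extension; adjust host:port to your setup.
session.save_handler = memcached
session.save_path = "localhost:11211"
```

Note that the older php5-memcache extension (no trailing "d") uses a different handler name and a tcp:// URL in save_path, so check which extension you actually have before copying settings.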

Solution 3:

The memcached and database session-storage options suggested by other users here are both good choices for increasing performance, each with its own benefits and drawbacks.

But performance testing showed that the huge performance cost of this session maintenance is almost entirely down to the call to fuser in the cron job. Here are the performance graphs after reverting to the Natty / Oneiric cron job, which uses rm instead of fuser to trim old sessions; the switchover happens at 2:30.

CPU usage

Elapsed IO time

Disk operations

You can see that the periodic performance degradation caused by Ubuntu's PHP session cleaning is almost entirely gone. The spikes in the Disk Operations graph are now much smaller in magnitude, and about as narrow as this graph can measure: a small, short disruption where previously server performance was significantly degraded for 25 minutes. The extra CPU usage is entirely eliminated; this is now an IO-bound job.

(An unrelated IO job runs at 05:00 and a CPU job at 7:40, each causing its own spikes on these graphs.)

The modified cron job I'm now running is:

09 *     * * *     root   [ -x /usr/lib/php5/maxlifetime ] && \
   [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 \
   -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 \
   | xargs -n 200 -r -0 rm

Solution 4:

I came across this post while doing some research on sessions. While the accepted answer is very good (and the fuser call has since been removed from the gc script), I think it's worth noting a few other considerations should anyone else come across a similar issue.

In the scenario described, the OP was using ext4. Directories in ext4 are indexed with an HTree structure, which means there is negligible impact in holding lots of files in a single directory compared with distributing them across multiple directories. This is not true of all filesystems. The default handler in PHP allows you to use multiple levels of subdirectories for session files (but note that you should check that the controlling process recurses into those directories; the cron job above does not).
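For reference, the subdirectory fan-out is enabled through the optional depth prefix on session.save_path; a sketch for one level of 16 subdirectories:

```ini
; Hypothetical php.ini fragment: one level of subdirectories (0/ .. f/)
; under the session directory. PHP will NOT create these directories,
; and its built-in garbage collection does not descend into them -
; you must create them in advance (the PHP source ships a mod_files.sh
; script for this) and clean them externally, e.g. from cron.
session.save_path = "1;/var/lib/php5"
```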

A lot of the cost of the operation (after removing the call to fuser) comes from examining files which are not yet stale. Using (for example) a single level of subdirectories, and 16 cron jobs each looking in one subdirectory (0/, 1/, ... e/, f/), will smooth out the resulting load bumps.
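As a sketch (assuming sessions already fan out into one level of subdirectories 0/ through f/), the staggered crontab entries could look like:

```
# Hypothetical crontab sketch: clean one subdirectory every few minutes
# instead of scanning every session file in a single pass.
00 *  * * *  root  [ -d /var/lib/php5/0 ] && find /var/lib/php5/0/ -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -n 200 -r -0 rm
03 *  * * *  root  [ -d /var/lib/php5/1 ] && find /var/lib/php5/1/ -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -n 200 -r -0 rm
# ...and so on for subdirectories 2/ through f/, staggered across the hour.
```

Each job touches only one sixteenth of the session files, so the disk IO is spread into many small bursts rather than one long one.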

Using a custom session handler backed by a faster substrate will help, but there is a lot to choose from (memcache, redis, the mysql handler socket, ...). Leaving aside the range in quality of handlers published on the internet, which one you choose depends on the exact requirements of your application, infrastructure and skills. Don't forget that handlers frequently differ from the default in their semantics, notably locking.