Gracefully deleting files older than 30 days

I have a cache folder with a minimum of 15,000 files.

I tried this:

find cache* -mtime +30 -exec rm {} \;

But this made my server load fly to the skies!

Is there any faster/better solution?

Or can I limit speed or iterations of this command?


Solution 1:

I like to use tmpwatch for this kind of job. The following removes files whose last modification time is more than 720 hours (30 days) ago. It's simple and works well in many cases:

tmpwatch -m 720 /path/to/cache

For Ubuntu, check tmpreaper instead.

If you want to go by the last time the file was accessed instead, use the following:

tmpwatch -a 720 /path/to/cache

You cannot rely on tmpwatch -a on file systems mounted with noatime, because access times are not updated there. You can still use -m.

Solution 2:

You could avoid the spawning of a new process for each file by using

find cache* -mtime +30 -delete
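
Before running -delete for real, it can help to preview the matches with -print. The following is a minimal sketch on a throwaway directory; the file names are hypothetical, and it assumes GNU touch (for -d '40 days ago') and a find that supports -delete.

```shell
# Build a scratch directory with one old and one fresh file (hypothetical names).
dir=$(mktemp -d)
touch -d '40 days ago' "$dir/old.cache"   # GNU touch: back-date the mtime
touch "$dir/new.cache"

# Dry run: list what would be removed.
find "$dir" -type f -mtime +30 -print

# Real run: same expression, with -delete in place of -print.
find "$dir" -type f -mtime +30 -delete

# Only the fresh file should be left.
remaining=$(ls "$dir")
echo "remaining: $remaining"
rm -rf "$dir"
```

Keeping the test expression identical between the dry run and the real run is the point: only the final action changes.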

Solution 3:

Try running the above with nice:

nice -n 39 find cache* -mtime +30 -exec rm -f {} ';'

That way the huge load will only appear if nothing else needs to run. Otherwise the other processes take precedence, as long as their niceness is lower than 19, the maximum.

Note that the argument to the -n option is added to the current niceness, and the result is clamped to the valid range of -20 to 19. I used 39 so that the command ends up at 19 regardless of the niceness it started with.
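
A quick way to check the resulting niceness is to run nice with no command inside the adjusted environment, since nice without arguments prints the current value. This sketch assumes GNU coreutils on Linux, where the upper limit is 19.

```shell
# Ask for +39 and print the niceness the child process actually gets.
# On Linux the value is expected to be clamped to 19.
requested=$(nice -n 39 nice)
echo "niceness with -n 39: $requested"
```

Note that nice only affects CPU scheduling. If the load comes mostly from disk activity, the effect may be smaller than hoped.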

Solution 4:

As commented by chiborg, the load is due to starting rm for every file found. I noticed the answer where tmpwatch is suggested as an alternative, which I'm sure works well. However, it is not necessary.

Find can run the command given to exec once, if you tell it to accumulate the found files into a list of arguments like so:

find /path -name "*.moo" -exec rm {} \+

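With {} +, find keeps each command line within the system's argument size limit (getconf ARG_MAX, which is enforced by the kernel rather than the shell) and simply starts another rm once a batch is full, so the list cannot overflow. If you want to choose the batch size yourself, you can use xargs with the -L option.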
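
To see how find groups paths with {} +, each invocation can report how many arguments it received. This is a small sketch on a scratch directory with hypothetical file names; it swaps rm for a tiny sh -c command so that nothing is deleted by the find itself.

```shell
# Create 100 empty files in a scratch directory (hypothetical names).
dir=$(mktemp -d)
i=0
while [ "$i" -lt 100 ]; do
    : > "$dir/file$i.moo"
    i=$((i + 1))
done

# Each output line is the number of paths that one invocation received.
batches=$(find "$dir" -name '*.moo' -exec sh -c 'echo "$#"' sh {} +)
echo "$batches"
rm -rf "$dir"
```

With only 100 short paths, everything fits into a single invocation, so one line is printed. On a very large tree you would expect several lines, one per batch.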

Consider this example, which feeds 15001 lines to xargs in batches of 5000:

$ echo 0 > /tmp/it; 
$ for i in {0..15000};do echo $i;done  |\
    xargs --no-run-if-empty -L 5000 ./tmp/xr.sh 
Iteration=0; running with 5000 arguments
Iteration=1; running with 5000 arguments
Iteration=2; running with 5000 arguments
Iteration=3; running with 1 arguments

$ cat tmp/xr.sh 
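#!/bin/sh
# Read the iteration counter, report the argument count, then store the incremented counter.
IT=$(cat /tmp/it)
echo "Iteration=$IT; running with $# arguments"
# POSIX arithmetic; "let" is a bashism and fails under a plain /bin/sh such as dash.
IT=$((IT + 1))
echo "$IT" > /tmp/it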

So there is no need to install extra software; all you need is in GNU findutils:

find /path -mtime +30 -print0 | xargs -0 --no-run-if-empty -L 5000 rm
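
The pipeline can be tried safely on a scratch directory first. The sketch below is only an illustration: the file names are hypothetical, it assumes GNU touch and GNU xargs, and it uses -n (which caps the number of arguments per rm call) instead of -L, together with -type f so that directories are never handed to rm.

```shell
# Scratch directory with two old files (one name contains a space) and one fresh file.
dir=$(mktemp -d)
touch -d '45 days ago' "$dir/a b.cache" "$dir/c.cache"
touch "$dir/fresh.cache"

# NUL-separated names survive spaces; rm runs at most once per 5000 files.
find "$dir" -type f -mtime +30 -print0 |
    xargs -0 --no-run-if-empty -n 5000 rm -f

# Only the fresh file should be left.
left=$(ls "$dir")
echo "left: $left"
rm -rf "$dir"
```

A smaller batch size spreads the deletions over more rm invocations, while a larger one keeps the process count down.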