How to clean up the tmp folder safely on Linux
I use RAM for my tmpfs /tmp, 2GB to be exact. Normally this is enough, but sometimes processes create files in there and fail to clean up after themselves, for example when they crash. I need to delete these orphaned tmp files, or else future processes will run out of space on /tmp.
How can I safely garbage collect /tmp? Some people do it by checking last modification timestamp, but this approach is unsafe because there can be long-running processes that still need those files. A safer approach is to combine the last modification timestamp condition with the condition that no process has a file handle for the file. Is there a program/script/etc that embodies this approach or some other approach that is also safe?
Incidentally, does Linux/Unix allow a mode of file opening with creation wherein the created file is deleted when the creating process terminates, even if it's from a crash?
Solution 1:
You might want to try something like this:
find /tmp -mtime +7 -and -not -exec fuser -s {} ';' -and -exec echo {} ';'
find is used to find files that match certain criteria.
- -mtime +7 only selects files that were last modified more than 7 days ago (you may use any other value).
- -exec fuser -s {} ';' calls fuser in silent mode for every file that matches the age criterion. fuser returns 0 (= true) for every file that is currently in use by some process and 1 (= false) for files that are not in use. As we are only interested in the unused ones, we put a -not in front of this -exec.
- -exec echo {} ';' just prints all file names matching the criteria. You might want to use -exec rm {} ';' instead here, but as this may delete some still-in-use files, I think it's safer to do a simple echo first.
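As a minimal sketch, the command above could be wrapped for a dry run roughly like this. The function name list_orphans and the scratch files are made up for illustration; -xdev and -type f are my additions to keep find on one filesystem and away from directories. It assumes GNU find, GNU touch and fuser from psmisc.

```shell
#!/bin/sh
# list_orphans DIR DAYS (hypothetical helper): print regular files under DIR
# that were modified more than DAYS days ago and are not open in any process.
list_orphans() {
    find "$1" -xdev -type f -mtime "+$2" ! -exec fuser -s {} ';' -print
}

# Demo in a scratch directory instead of the real /tmp.
scratch=$(mktemp -d) || exit 1
touch -d '10 days ago' "$scratch/held" "$scratch/orphan"   # GNU touch syntax
exec 3<"$scratch/held"       # keep one of the old files open in this shell
list_orphans "$scratch" 7    # "held" is in use, so only "orphan" should be listed
exec 3<&-                    # close the file again
rm -rf "$scratch"
```

Note that fuser can only inspect processes it is allowed to look at, so a real cleanup of /tmp would normally have to run as root to see files held open by other users.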
edit: You might want to add something like -name 'foo*.bar' or -uid 123 to limit the effects of the cleanup to specific file patterns or user IDs to avoid accidental effects.
To the last point: Consider that there might be files that are only written once (e.g. at system boot) but read frequently (e.g. any X-session-cookie). Therefore I recommend adding some name checks to only affect files created by your faulty programs.
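A sketch of such a restricted dry run, assuming the faulty program names its files foo*.bar as in the pattern above. list_stale is a hypothetical helper name, and the demo files are invented; for real use you would add the fuser test from the find command above.

```shell
#!/bin/sh
# list_stale DIR PATTERN DAYS (hypothetical helper): print regular files under
# DIR whose name matches PATTERN and whose mtime is more than DAYS days old.
list_stale() {
    find "$1" -xdev -type f -name "$2" -mtime "+$3" -print
}

# Demo in a scratch directory instead of the real /tmp.
scratch=$(mktemp -d) || exit 1
touch -d '10 days ago' "$scratch/foo-old.bar"   # old, matches the pattern
touch -d '10 days ago' "$scratch/other-old.txt" # old, but a different name
touch "$scratch/foo-new.bar"                    # matches the pattern, but new
list_stale "$scratch" 'foo*.bar' 7              # only foo-old.bar qualifies
rm -rf "$scratch"
```

Quoting the pattern matters here: unquoted, the shell would expand foo*.bar before find ever sees it.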
edit2: To your last question: A file won't get deleted from disk until no process has an open handle to it (at least on native Linux filesystems). The directory entry, however, is removed immediately, which means that from the moment you remove the file no new process can open it (as there's no filename attached to it anymore). A process can therefore create a temporary file, open it, and remove it right away: the space is released once the last handle is closed, which also happens when the process crashes.
For details see: https://stackoverflow.com/questions/3181641/how-can-i-delete-a-file-upon-its-close-in-c-on-linux
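To illustrate the remove-while-open behaviour from the shell, here is a minimal sketch. The variable names and the sample text are made up, and reading the data back through /proc is Linux-specific.

```shell
#!/bin/sh
tmp=$(mktemp) || exit 1      # create the temporary file
exec 3<>"$tmp"               # open it read/write on file descriptor 3
rm -- "$tmp"                 # remove the name; the data stays while fd 3 is open

echo 'scratch data' >&3      # the file is still writable through the handle
[ -e "$tmp" ] || echo "name is gone: $tmp"

# Linux-specific: reopen the unlinked file through /proc to read it back.
data=$(cat "/proc/$$/fd/3")
echo "read back: $data"

exec 3>&-                    # closing the last handle releases the space
```

There is still a short window between mktemp and rm in which a crash would leave the file behind. On Linux, open(2) also has an O_TMPFILE flag that creates a file without any name in the first place, but that has to be used from the program itself rather than from a shell script.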
edit3: But what if I wanted to automate the whole process?
As I said, there might be files that are written once and then read every once in a while (e.g. X session cookies, PID files, etc.). Those won't be excluded by this little removal script (which is the reason why you might wanna do a test run with echo first before actually deleting files).
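One way to implement a safe solution is to use atime. atime stores the time each file was last accessed. But that file system option is often disabled because it has quite some performance impact (according to this blog somewhere in the 20-30% region). There's relatime, but that one only writes the access time if the previous atime is not newer than mtime or ctime, or (since Linux 2.6.30) if the previous atime is more than a day old. That makes it accurate to about a day, which can be enough for thresholds of several days, but it is not an exact record of the last access.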
If you want to use atime, I'd recommend putting /tmp on a separate partition (ideally a ramdisk) so that the performance impact on the whole system isn't too big.
Once atime is enabled, all you have to do is to replace the -mtime parameter in the above command line with -atime.
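A small sketch of the difference, assuming GNU find and GNU touch. The helper name list_unread and the file names are invented, and the access time of the "cookie" file is faked with touch -a, so the demo does not depend on how the filesystem is mounted.

```shell
#!/bin/sh
# list_unread DIR DAYS (hypothetical helper): print regular files under DIR
# that were last accessed more than DAYS days ago.
list_unread() {
    find "$1" -xdev -type f -atime "+$2" -print
}

scratch=$(mktemp -d) || exit 1
# Both files were written 10 days ago (touch -d sets atime and mtime) ...
touch -d '10 days ago' "$scratch/cookie" "$scratch/stale"
# ... but "cookie" was read an hour ago; touch -a fakes that access time.
touch -a -d '1 hour ago' "$scratch/cookie"

find "$scratch" -type f -mtime +7 -print   # by mtime: both files match
list_unread "$scratch" 7                   # by atime: only "stale" matches
rm -rf "$scratch"
```

This is the written-once, read-often case from above: the mtime test would flag the cookie for deletion, while the atime test spares it.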
You might be able to remove the -not -exec fuser -s {} ';', but I'd keep it there just to be sure (in case applications keep files open for a long period of time).
But keep in mind to test the command using echo before you end up removing stuff your system still needs!
Solution 2:
Don't roll your own.
Debian/Ubuntu have tmpreaper; it's probably available in other distributions as well.
# tmpreaper - cleans up files in directories based on their age
sudo apt-get install tmpreaper
cat /etc/tmpreaper.conf