How to delete millions of files without disturbing the server
I'd like to delete an nginx cache directory, which I quickly purged by:
mv cache cache.bak
mkdir cache
service nginx restart
Now I have a cache.bak
folder which has 2 million files. I'd like to delete it, without disturbing the server.
A simple rm -rf cache.bak
thrashes the server: even the simplest HTTP response takes 16 seconds while rm is running, so I cannot do that.
I tried ionice -c3 rm -rf cache.bak
, but it didn't help. The server has an HDD, not an SSD; on an SSD this probably would not be a problem.
I believe the best solution would be some kind of throttling, similar to what nginx's built-in cache manager does.
How would you solve this? Is there any tool which can do exactly this?
ext4 on Ubuntu 16.04
Solution 1:
Make a bash script like this:
#!/bin/bash
# "$@" passes each filename as its own argument; "$*" would join them into one name
rm -- "$@"
sleep 0.5
Save it with name deleter.sh
for example. Run chmod u+x deleter.sh
to make it executable.
This script deletes all files passed to it as arguments, and then sleeps 0.5 seconds.
Then, you can run
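find cache.bak -type f -print0 | xargs -0 -n 5 ./deleter.sh
This command lists all regular files in cache.bak and passes five filenames at a time to the delete script. The -type f test is needed because rm without -r cannot remove directories, and ./deleter.sh assumes the script is in the current directory.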
You can adjust how many files are deleted at a time (the -n value) and how long the delay between batches is (the sleep value).
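If you would rather not keep a separate helper script, the same batch-and-sleep idea can probably be folded into a single shell function. This is only a minimal sketch: throttled_delete and its batch_size and delay parameters are names made up for illustration, and it assumes GNU find, xargs and sleep as shipped on Ubuntu.

```shell
#!/bin/bash
# Sketch: delete regular files under a directory in small batches,
# pausing after each batch so other disk I/O gets a chance to run.
# throttled_delete, batch_size and delay are illustrative names.
throttled_delete() {
    local dir=$1 batch_size=${2:-5} delay=${3:-0.5}
    # xargs starts one inline shell per batch. Inside it, $0 is the delay
    # and "$@" holds the filenames of that batch.
    find "$dir" -type f -print0 |
        xargs -0 -r -n "$batch_size" sh -c 'rm -f -- "$@"; sleep "$0"' "$delay"
}
```

Calling throttled_delete cache.bak 5 0.5 should behave like the two-step version above. Only regular files are removed, so the empty directories remain.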
Solution 2:
You should consider storing your cache on a separate filesystem that you can mount and unmount, as someone suggested in the comments. Until you do, you can use this one-liner:
/usr/bin/find /path/to/files/ -type f -print0 -exec sleep 0.2 \; -exec echo \; -delete
This assumes your find binary is located under /usr/bin and that you want to see the progress on screen. Adjust the sleep so you don't overstress your HDD.
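Note that -type f only removes the files, so the directory tree of the old cache stays behind. One possible follow-up is sketched here, assuming GNU find; remove_empty_dirs is a hypothetical helper name, not an existing tool.

```shell
#!/bin/bash
# Sketch: remove the directory skeleton left after the files are gone.
# -delete implies -depth, so children are visited before their parents,
# and -empty leaves alone any directory that still contains something.
remove_empty_dirs() {
    find "$1" -type d -empty -delete
}
```

Running remove_empty_dirs /path/to/files/ after the one-liner has finished should remove the whole tree, including the top directory, as long as no files are left in it.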