Using `find` to delete

So, given three options...

  1. find .... -delete
  2. find .... | xargs rm ...
  3. find .... -exec rm ...;

..or variations thereof, which option is preferable?
I'm guessing there is no hard and fast answer, and a specific situation will dictate the best option (please name them!)

Cheers.


Solution 1:

Option 1 will avoid spawning external processes, which is useful when the system is under heavy load.

Option 2 will spawn a single xargs process, which will spawn only as many rm processes as necessary. This option is typically used with -print0 and -0 in order to handle filenames with spaces and/or newlines.

Option 3 will spawn a rm process for each file.

GNU find (or any POSIX-compliant version of find) allows a fourth option, find .... -exec rm -r {} +, which passes as many filenames as possible to each rm invocation, so only a few rm processes are spawned.
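To make the four variants concrete, here is a minimal sketch run against a throwaway directory. The '*.tmp' pattern, the file names and the make_files helper are assumptions made up for illustration. -delete, -print0 and xargs -0 are available in GNU and BSD find/xargs, but may be missing on older systems.

```shell
# Throwaway tree; the '*.tmp' pattern is only an illustration.
demo=$(mktemp -d)
make_files() { touch "$demo/a.tmp" "$demo/b c.tmp" "$demo/keep.txt"; }

# Option 1: find removes the files itself, no external process.
make_files
find "$demo" -type f -name '*.tmp' -delete

# Option 2: NUL-separated names survive spaces and newlines.
make_files
find "$demo" -type f -name '*.tmp' -print0 | xargs -0 rm -f --

# Option 3: one rm per file; the semicolon must be escaped from the shell.
make_files
find "$demo" -type f -name '*.tmp' -exec rm -f -- {} \;

# Option 4: names are batched, so only a few rm processes are started.
make_files
find "$demo" -type f -name '*.tmp' -exec rm -f -- {} +

ls "$demo"    # only keep.txt remains
```

All four leave the same result here; they differ in how many processes are started and in how they behave if the tree changes while find is running.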

Solution 2:

I prefer to run find ... > file.txt, review the file extensively, and then run find ... -delete with the same expression, so I know the exact same results will be deleted (passing arguments is mostly bulletproof, mostly).
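A sketch of that workflow, assuming a hypothetical '*.bak' pattern and a temporary directory standing in for the real tree. Two caveats: the result is only "the exact same" if the tree has not changed between the two runs, and -delete should come last in the expression, because find evaluates it left to right and an early -delete acts before the later tests are applied.

```shell
# Hypothetical tree and pattern, for illustration only.
dir=$(mktemp -d)
list=$(mktemp)
touch "$dir/old.bak" "$dir/notes.txt"

# 1. Dry run: write the matches to a file and review it.
find "$dir" -type f -name '*.bak' -print > "$list"
cat "$list"

# 2. Same expression again, with -delete as the last term.
find "$dir" -type f -name '*.bak' -delete
```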

Solution 3:

The topic of deleting files is addressed in the section "Cleaning Up" in the GNU findutils documentation. You can read that on your system using "info find", or within Emacs. You can also view it online at http://www.gnu.org/software/findutils/manual/html_node/find_html/Cleaning-Up.html#Cleaning-Up.

find .... -delete

This is the most secure (against symbolic link races) and high-performance (since there is no need to exec anything or perform a context switch when the pipe buffer is full) option. But bear in mind that -delete implies -depth.
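The -depth implication is easy to see on a small tree. The sketch below (directory names are made up) prints the default order and the -depth order, then deletes. Because children are processed before their parent, each directory is already empty by the time find reaches it. The GNU manual also notes that -prune has no effect when -depth is in force, so -prune and -delete do not combine usefully.

```shell
tree=$(mktemp -d)
mkdir -p "$tree/sub"
touch "$tree/sub/file"

# Default order: a directory is listed before its contents.
find "$tree" -print

# With -depth: contents first, the directory itself last.
find "$tree" -depth -print

# -delete uses the -depth order, so sub/file goes first, then sub,
# then the starting directory.
find "$tree" -delete
```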

find .... | xargs rm ...

This is dangerous in situations where others have write access to the tree you're doing cleanup in. For example, suppose the find command decides that /var/tmp/scratch/me/.ssh/config matches its requirements and therefore prints that name to stdout. The xargs command will read that and add it to a data structure. A short time later (when xargs has read the number of bytes indicated by the default value of the -s option) xargs will fork and exec rm to delete it. However, it's possible that in the meantime, someone else has done this:

$ cd /var/tmp/scratch
$ mv me me.old
$ ln -s /root me

Then, when rm goes to delete /var/tmp/scratch/me/.ssh/config, it will issue the system call unlink("/var/tmp/scratch/me/.ssh/config"). Because the kernel will resolve the symbolic link for you, this is equivalent to calling unlink("/root/.ssh/config"). If the xargs process was running as root, then /root/.ssh/config will get deleted, despite the fact that you didn't specify -L on the command line. For this reason, if security is important, use -delete. You can read more about this area in the "Security Considerations" section of the GNU find manual.
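The race can be reproduced harmlessly, without root, inside a temporary directory. In this sketch the "victim" directory stands in for /root, and the gap between find printing the name and rm acting on it is simulated by storing the name in a variable; all paths are made up for the demonstration.

```shell
# Simulate the race in a throwaway directory (no root needed).
base=$(mktemp -d)
mkdir -p "$base/scratch/me/.ssh" "$base/victim/.ssh"
echo decoy  > "$base/scratch/me/.ssh/config"
echo secret > "$base/victim/.ssh/config"

# 1. find emits the pathname; pretend xargs is still buffering it.
target=$(find "$base/scratch" -name config -print)

# 2. Meanwhile the attacker swaps the directory for a symlink.
mv "$base/scratch/me" "$base/scratch/me.old"
ln -s "$base/victim" "$base/scratch/me"

# 3. rm resolves the stale pathname through the symlink.
rm -- "$target"

ls "$base/victim/.ssh"           # empty: the victim's config was removed
ls "$base/scratch/me.old/.ssh"   # config: the file find matched is still there
```

The file that find actually examined survives, and a file it never looked at is gone, which is the behaviour described above.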

find .... -exec rm ...;

Because this also involves fork/exec, it has the same security issues I mention above.

In short, the only reason not to use -delete is compatibility with systems which lack support for -delete.