IBM GPFS: removing files recursively is very slow

To delete files recursively on our IBM GPFS cluster, we use a simple Unix command such as:

rm -rf /my/directories

However, the deletions take a very long time.

The problem is that our distributed (Spark-based) apps take about one hour to run, and then it takes about another hour to delete the temporary files they generated.

So the overall workload is very inefficient. Maybe this is because the rm command has to list every subdirectory.

Anyway, do you know of a way to efficiently delete an entire directory (and its subdirectories) on GPFS?

Maybe IBM provides a special command for that?


I don’t think you can speed this process up: on a distributed file system, “rm” triggers a large number of metadata updates, and those take quite some time to complete. What you can try is to “mv” the directory to a temp folder within the same file system (this is essential, otherwise the move turns into a copy followed by a delete) and run the actual “rm” in the background.
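
As a rough sketch of that idea (not a GPFS-specific tool), the helper below renames the directory into a holding area and then deletes it in the background. The function name fast_remove and the holding directory passed as the second argument are made up for illustration; the holding directory is assumed to live on the same file system as the directory being removed.

```sh
#!/bin/sh
# Sketch only: fast_remove is a hypothetical helper, not an IBM or GPFS command.
# $1 = directory to remove
# $2 = holding directory (assumed to be on the SAME file system as $1)
fast_remove() {
    target=$1
    trash_root=$2

    mkdir -p "$trash_root" || return 1

    # A unique holding directory per call, so repeated runs cannot collide.
    holding=$(mktemp -d "$trash_root/rm.XXXXXX") || return 1

    # Within one file system, mv is a rename: the files inside are not touched,
    # so the original path disappears almost immediately.
    mv "$target" "$holding"/ || return 1

    # The slow recursive delete runs in the background, detached from the caller.
    nohup rm -rf "$holding" >/dev/null 2>&1 &
}
```

It could be called as, for example, fast_remove /my/directories /my/.trash (the second path is a placeholder). Note that this does not reduce the amount of metadata work: the job just no longer waits for it, and the disk space is only released once the background rm finishes.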