C# Remove all empty subdirectories

I have a task to clean up a large number of directories. I want to start at a directory and delete any sub-directories (no matter how deep) that contain no files (files will never be deleted, only directories). The starting directory will then be deleted if it contains no files or subdirectories. I was hoping someone could point me to some existing code for this rather than having to reinvent the wheel. I will be doing this using C#.


Solution 1:

Using C# code:

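using System.IO;

static void Main(string[] args)
{
    processDirectory(@"c:\temp");
}

// Recursively deletes every empty subdirectory below startLocation.
// startLocation itself is never deleted by this method.
private static void processDirectory(string startLocation)
{
    foreach (var directory in Directory.GetDirectories(startLocation))
    {
        // Clean the children first so that a directory containing
        // only empty directories becomes empty itself.
        processDirectory(directory);
        if (Directory.GetFiles(directory).Length == 0 && 
            Directory.GetDirectories(directory).Length == 0)
        {
            // false: non-recursive delete, fails if the directory is not empty.
            Directory.Delete(directory, false);
        }
    }
}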
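
The question also asks for the starting directory to be removed once it is empty, which processDirectory above does not do. A minimal sketch of one way to cover that case follows. The class and method names (EmptyDirectoryCleaner, DeleteIfEmpty) are made up for illustration, and the sketch assumes the caller has permission to delete everything it visits; it does no error handling.

```csharp
using System.IO;

static class EmptyDirectoryCleaner
{
    // Hypothetical helper: cleans 'path' bottom-up and deletes 'path'
    // itself if nothing is left in it. Returns true if 'path' was deleted.
    public static bool DeleteIfEmpty(string path)
    {
        // Clean the subdirectories first so that a directory holding
        // only empty directories becomes empty itself.
        foreach (var child in Directory.GetDirectories(path))
        {
            DeleteIfEmpty(child);
        }

        // Any remaining file or directory means this one must stay.
        if (Directory.GetFileSystemEntries(path).Length > 0)
        {
            return false;
        }

        // Non-recursive delete: throws if the directory is not empty.
        Directory.Delete(path, false);
        return true;
    }
}
```

Calling EmptyDirectoryCleaner.DeleteIfEmpty(@"c:\temp") would then remove c:\temp as well, but only when no files remain anywhere beneath it.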

Solution 2:

If you can target .NET 4.0, you can use the new enumeration methods on the Directory class. They return results lazily, so you do not pay the cost of listing every file in a directory when you only want to know whether it contains at least one entry.

The methods are:

  • Directory.EnumerateDirectories
  • Directory.EnumerateFiles
  • Directory.EnumerateFileSystemEntries

A possible implementation using recursion:

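using System;
using System.IO;
using System.Linq; // needed for Any()

static void Main(string[] args)
{
    DeleteEmptyDirs("Start");
}

static void DeleteEmptyDirs(string dir)
{
    if (String.IsNullOrEmpty(dir))
        throw new ArgumentException(
            "Starting directory is a null reference or an empty string", 
            "dir");

    try
    {
        // Clean the subdirectories first (depth-first).
        foreach (var d in Directory.EnumerateDirectories(dir))
        {
            DeleteEmptyDirs(d);
        }

        // Lazy enumeration: Any() stops at the first entry found.
        var entries = Directory.EnumerateFileSystemEntries(dir);

        if (!entries.Any())
        {
            try
            {
                Directory.Delete(dir);
            }
            // Skip directories we may not delete or that are already gone.
            catch (UnauthorizedAccessException) { }
            catch (DirectoryNotFoundException) { }
        }
    }
    // Skip directories we are not allowed to read.
    catch (UnauthorizedAccessException) { }
}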

You also mention that the directory tree could be very deep, so you might get exceptions if the paths being probed are too long.
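
One way to keep such failures from aborting the whole run is to catch them per directory and report them, roughly as sketched below. This is an illustration under assumptions, not a complete solution: the class name and the onSkipped callback are hypothetical, and whether PathTooLongException is actually thrown depends on the runtime and on the system's long-path settings. A failure while enumerating a directory also skips that directory's remaining children.

```csharp
using System;
using System.IO;
using System.Linq;

static class SafeEmptyDirectoryCleaner
{
    // Hypothetical variant of DeleteEmptyDirs: directories that cannot be
    // processed are passed to 'onSkipped' instead of stopping the run.
    public static void DeleteEmptyDirs(string dir, Action<string, Exception> onSkipped)
    {
        try
        {
            // Depth-first: clean the children before checking this directory.
            foreach (var d in Directory.EnumerateDirectories(dir))
            {
                DeleteEmptyDirs(d, onSkipped);
            }

            if (!Directory.EnumerateFileSystemEntries(dir).Any())
            {
                Directory.Delete(dir);
            }
        }
        catch (PathTooLongException ex) { onSkipped(dir, ex); }
        catch (UnauthorizedAccessException ex) { onSkipped(dir, ex); }
        catch (DirectoryNotFoundException ex) { onSkipped(dir, ex); }
    }
}
```

A caller could log the skipped paths, for example with SafeEmptyDirectoryCleaner.DeleteEmptyDirs("Start", (path, ex) => Console.WriteLine(path + ": " + ex.Message)), and deal with them separately.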

Solution 3:

Running the test on C:\Windows 1000 times on the 3 methods mentioned so far yielded this:

GetFiles+GetDirectories: 630 ms
GetFileSystemEntries: 295 ms
EnumerateFileSystemEntries.Any: 71 ms

Running it on an empty folder yielded this (1000 times again):

GetFiles+GetDirectories: 131 ms
GetFileSystemEntries: 66 ms
EnumerateFileSystemEntries.Any: 64 ms

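So EnumerateFileSystemEntries combined with Any() is the best overall when you are checking for empty folders: it is far faster on a populated folder and roughly on par with GetFileSystemEntries on an empty one.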
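
For anyone who wants to repeat this kind of measurement, a rough sketch of the three emptiness checks and a timing helper is shown below. It is not the code that produced the numbers above; the class and method names are invented for illustration, and results will vary with the machine, the file system and caching.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

static class EmptyCheckBenchmark
{
    // Builds two full arrays (files and directories) just to test their lengths.
    public static bool ViaGetFilesAndDirectories(string path) =>
        Directory.GetFiles(path).Length == 0 &&
        Directory.GetDirectories(path).Length == 0;

    // Builds one full array of all entries.
    public static bool ViaGetFileSystemEntries(string path) =>
        Directory.GetFileSystemEntries(path).Length == 0;

    // Lazy: Any() stops as soon as the first entry is returned.
    public static bool ViaEnumerate(string path) =>
        !Directory.EnumerateFileSystemEntries(path).Any();

    // Runs 'check' against 'path' the given number of times and
    // returns the total elapsed time in milliseconds.
    public static long TimeMs(Func<string, bool> check, string path, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            check(path);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

For example, EmptyCheckBenchmark.TimeMs(EmptyCheckBenchmark.ViaEnumerate, @"C:\Windows", 1000) times 1000 runs of the lazy check; swapping in the other two methods gives the comparison.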

Solution 4:

Here's a version that takes advantage of parallel execution to get it done faster in some cases:

public static void DeleteEmptySubdirectories(string parentDirectory){
  System.Threading.Tasks.Parallel.ForEach(System.IO.Directory.GetDirectories(parentDirectory), directory => {
    DeleteEmptySubdirectories(directory);
    if(!System.IO.Directory.EnumerateFileSystemEntries(directory).Any()) System.IO.Directory.Delete(directory, false);
  });   
}

Here's the same code in single-threaded mode:

public static void DeleteEmptySubdirectoriesSingleThread(string parentDirectory){
  foreach(string directory in System.IO.Directory.GetDirectories(parentDirectory)){
    DeleteEmptySubdirectoriesSingleThread(directory);
    if(!System.IO.Directory.EnumerateFileSystemEntries(directory).Any()) System.IO.Directory.Delete(directory, false);
  }
}

... and here's some sample code you could use to test results in your scenario:

var stopWatch = new System.Diagnostics.Stopwatch();
for(int i = 0; i < 100; i++) {
  stopWatch.Restart();
  DeleteEmptySubdirectories(rootPath);
  stopWatch.Stop();
  StatusOutputStream.WriteLine("Parallel: "+stopWatch.ElapsedMilliseconds);
  stopWatch.Restart();
  DeleteEmptySubdirectoriesSingleThread(rootPath);
  stopWatch.Stop();
  StatusOutputStream.WriteLine("Single: "+stopWatch.ElapsedMilliseconds);
}

... and here are some results (in milliseconds) from my machine for a directory on a file share across a wide area network. The share currently has only 16 subfolders and 2277 files.

Parallel: 1479
Single: 4724
Parallel: 1691
Single: 5603
Parallel: 1540
Single: 4959
Parallel: 1592
Single: 4792
Parallel: 1671
Single: 4849
Parallel: 1485
Single: 4389

Solution 5:

From here, a PowerShell script to remove empty directories:

$items = Get-ChildItem -Recurse

foreach($item in $items)
{
      if( $item.PSIsContainer )
      {
            $subitems = Get-ChildItem -Recurse -Path $item.FullName
            if($subitems -eq $null)
            {
                  "Remove item: " + $item.FullName
                  Remove-Item $item.FullName
            }
            $subitems = $null
      }
}

Note: use at your own risk!