Progress bar not available for ZipFile? How to give feedback when the program seems to hang

I am fairly new to C# and coding in general, so I may be going about some of this the wrong way. The program I wrote works and compresses the file as expected, but if the source is rather large, the program appears (to Windows) to hang. I feel like I should be using a thread, but I am not sure that will help.

I would use a progress bar, but the 'new' (.NET 4.5) ZipFile class in System.IO.Compression, which replaced Ionic.Zip.ZipFile, does not seem to have a way to report progress. Is there a way around this? Should I be using a Thread, or a BackgroundWorker's DoWork?

The trouble is that neither the user nor the system is getting any feedback on what the program is doing.

I am not sure I am asking the question the right way. Below is the code, which works but, again, appears to hang the system.

    private void beginBackup_Click(object sender, EventArgs e)
    {
        try
        {
            long timeTicks = DateTime.Now.Ticks;
            string zipName = "bak" + timeTicks + ".zip";
            MessageBox.Show("This Will take a bit, there is no status bar :(");
            ZipFile.CreateFromDirectory(Properties.Settings.Default.source,
                  Properties.Settings.Default.destination + "\\" + zipName);
            MessageBox.Show("Done!");
            this.Close();
        }
        catch (IOException err)
        {
            MessageBox.Show("Something went wrong" + System.Environment.NewLine
                + "IOException source: {0}", err.Source);
        }
    }

The important line being:

    ZipFile.CreateFromDirectory(Properties.Settings.Default.source,
        Properties.Settings.Default.destination + "\\" + zipName);

EDIT

ZipFile.CreateFromDirectory() does not walk the directory in my code, so there is nothing to increment; it would simply start and finish with no reporting. Unless I am mistaken?

Using this method:

    while (!completed)
    {
        // your code here to do something
        for (int i = 1; i <= 100; i++)
        {
            percentCompletedSoFar = i;
            var t = new Task(() => WriteToProgressFile(i));
            t.Start();
            await t;
            if (progress != null)
            {
                progress.Report(percentCompletedSoFar);
            }
            completed = i == 100;
        }
    }

the code in the for loop would only run once, as the ZipFile call would still hang the program, and then the progress bar would immediately go from 0 to 100?


I would use a progress bar, but the 'new' (.NET 4.5) ZipFile class in System.IO.Compression, which replaced Ionic.Zip.ZipFile, does not seem to have a way to report progress. Is there a way around this? Should I be using a Thread, or a BackgroundWorker's DoWork?

You really have two issues here:

  1. The .NET version of the ZipFile class does not include progress reporting.
  2. The CreateFromDirectory() method blocks until the entire archive has been created.

I am not that familiar with the Ionic/DotNetZip library, but browsing the docs, I don't see any asynchronous methods for creating an archive from a directory. So #2 would be an issue regardless. The easiest way to solve it is to run the work in a background thread, e.g. using Task.Run().
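
For example, here is a minimal sketch of the button handler from the question, reworked around async/await and Task.Run (assuming a .NET 4.5+ WinForms project, the usual using directives, and the same Properties.Settings values; the button name beginBackup is inferred from the handler name):

    // Sketch only: same logic as the handler in the question, but the blocking
    // CreateFromDirectory call runs on a thread-pool thread via Task.Run, so the
    // UI message loop keeps pumping and Windows no longer reports the app as hung.
    private async void beginBackup_Click(object sender, EventArgs e)
    {
        string zipName = "bak" + DateTime.Now.Ticks + ".zip";
        string destination = Path.Combine(Properties.Settings.Default.destination, zipName);

        try
        {
            await Task.Run(() =>
                ZipFile.CreateFromDirectory(Properties.Settings.Default.source, destination));
            MessageBox.Show("Done!");
        }
        catch (IOException err)
        {
            MessageBox.Show("Something went wrong: " + err.Message);
        }
    }

This alone fixes the apparent hang; progress reporting is addressed separately below.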

As for the #1 issue, I would not characterize the .NET ZipFile class as having replaced the Ionic library. Yes, it's new. But .NET already had .zip archive support in previous versions, just not a convenience class like ZipFile. And neither the earlier support for .zip archives nor ZipFile provides progress reporting "out-of-the-box", so neither really replaces the Ionic DLL per se.

So IMHO, if you were using the Ionic DLL and it worked for you, the best solution is to just keep using it.
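
For what it's worth, DotNetZip does expose progress hooks of its own: its ZipFile class raises a SaveProgress event while an archive is being written. The following is a rough sketch from memory of the Ionic.Zip API, so treat the member names as unverified:

    // Unverified sketch of progress reporting with DotNetZip (Ionic.Zip), based on its
    // SaveProgress event. Fully-qualified names are used to avoid clashing with
    // System.IO.Compression.ZipFile.
    static void CreateZipWithDotNetZip(string sourceDir, string zipPath, IProgress<double> progress)
    {
        using (var zip = new Ionic.Zip.ZipFile())
        {
            zip.AddDirectory(sourceDir);
            zip.SaveProgress += (sender, e) =>
            {
                // Report a coarse percentage after each entry has been written.
                if (e.EventType == Ionic.Zip.ZipProgressEventType.Saving_AfterWriteEntry)
                {
                    progress.Report((double)e.EntriesSaved / e.EntriesTotal);
                }
            };
            zip.Save(zipPath);   // still a blocking call; run it via Task.Run from a UI
        }
    }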

If you really don't want to use it, your options are limited. The .NET ZipFile just doesn't do what you want. There are some hacky things you could do to work around the missing feature. For writing an archive, you could estimate the compressed size, then monitor the file size as it's being written and compute an estimated progress based on that (i.e. poll the file size in a separate async task, every second or so). For extracting an archive, you could monitor the files being generated and compute progress that way.
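
For illustration only, a rough sketch of that file-size-polling hack might look like the following; the helper name and the assumed ~50% compression ratio are arbitrary guesses, not anything the library provides:

    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Linq;
    using System.Threading.Tasks;

    // Hack sketch: guess the final archive size, then poll the growing .zip file while
    // the real work runs on another thread. The 50% compression ratio is a guess.
    static class EstimatedZipProgress
    {
        public static async Task CreateFromDirectoryAsync(
            string sourceDir, string zipPath, IProgress<double> progress)
        {
            long totalBytes = new DirectoryInfo(sourceDir)
                .GetFiles("*", SearchOption.AllDirectories).Sum(f => f.Length);
            long estimatedZipBytes = Math.Max(1, totalBytes / 2);

            Task zipTask = Task.Run(() => ZipFile.CreateFromDirectory(sourceDir, zipPath));

            while (!zipTask.IsCompleted)
            {
                await Task.Delay(1000);
                long written = File.Exists(zipPath) ? new FileInfo(zipPath).Length : 0;
                progress.Report(Math.Min(1.0, (double)written / estimatedZipBytes));
            }

            await zipTask;        // propagate any exception from the zip operation
            progress.Report(1.0);
        }
    }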

But at the end of the day, that sort of approach is far from ideal.

Another option is to monitor the progress by using the older ZipArchive-based features, writing the archive yourself explicitly and tracking the bytes as they are read from the source file. To do this, you can write a Stream implementation that wraps the real input stream, and which provides progress reporting as the bytes are read.

Here's a simple example of what that Stream might look like (note the comment about this being for illustration purposes; it really would be better to delegate all of the virtual methods, not just the ones you're required to override):

Note: in the course of looking for existing questions related to this one, I found one that is essentially a duplicate, except that it's asking for a VB.NET answer instead of C#. It also asked for progress updates while extracting from an archive, in addition to creating one. So I adapted my answer here, for VB.NET, adding the extraction method, and tweaking the implementation a little. I've updated the answer below to incorporate those changes.

StreamWithProgress.cs

    using System;
    using System.IO;

    class StreamWithProgress : Stream
    {
        // NOTE: for illustration purposes. For production code, one would want to
        // override *all* of the virtual methods, delegating to the base _stream object,
        // to ensure performance optimizations in the base _stream object aren't
        // bypassed.

        private readonly Stream _stream;
        private readonly IProgress<int> _readProgress;
        private readonly IProgress<int> _writeProgress;

        public StreamWithProgress(Stream stream, IProgress<int> readProgress, IProgress<int> writeProgress)
        {
            _stream = stream;
            _readProgress = readProgress;
            _writeProgress = writeProgress;
        }

        public override bool CanRead { get { return _stream.CanRead; } }
        public override bool CanSeek { get { return _stream.CanSeek; } }
        public override bool CanWrite { get { return _stream.CanWrite; } }
        public override long Length { get { return _stream.Length; } }
        public override long Position
        {
            get { return _stream.Position; }
            set { _stream.Position = value; }
        }

        public override void Flush() { _stream.Flush(); }
        public override long Seek(long offset, SeekOrigin origin) { return _stream.Seek(offset, origin); }
        public override void SetLength(long value) { _stream.SetLength(value); }

        public override int Read(byte[] buffer, int offset, int count)
        {
            int bytesRead = _stream.Read(buffer, offset, count);

            _readProgress?.Report(bytesRead);
            return bytesRead;
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            _stream.Write(buffer, offset, count);
            _writeProgress?.Report(count);
        }
    }

With that in hand, it's relatively simple to handle the archive creation explicitly, using that Stream to monitor the progress:

ZipFileWithProgress.cs

    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Linq;

    static class ZipFileWithProgress
    {
        public static void CreateFromDirectory(string sourceDirectoryName, string destinationArchiveFileName, IProgress<double> progress)
        {
            sourceDirectoryName = Path.GetFullPath(sourceDirectoryName);

            FileInfo[] sourceFiles =
                new DirectoryInfo(sourceDirectoryName).GetFiles("*", SearchOption.AllDirectories);
            double totalBytes = sourceFiles.Sum(f => f.Length);
            long currentBytes = 0;

            using (ZipArchive archive = ZipFile.Open(destinationArchiveFileName, ZipArchiveMode.Create))
            {
                foreach (FileInfo file in sourceFiles)
                {
                    // NOTE: naive method to get sub-path from file name, relative to
                    // input directory. Production code should be more robust than this.
                    // Either use Path class or similar to parse directory separators and
                    // reconstruct output file name, or change this entire method to be
                    // recursive so that it can follow the sub-directories and include them
                    // in the entry name as they are processed.
                    string entryName = file.FullName.Substring(sourceDirectoryName.Length + 1);
                    ZipArchiveEntry entry = archive.CreateEntry(entryName);

                    entry.LastWriteTime = file.LastWriteTime;

                    using (Stream inputStream = File.OpenRead(file.FullName))
                    using (Stream outputStream = entry.Open())
                    {
                        Stream progressStream = new StreamWithProgress(inputStream,
                            new BasicProgress<int>(i =>
                            {
                                currentBytes += i;
                                progress.Report(currentBytes / totalBytes);
                            }), null);

                        progressStream.CopyTo(outputStream);
                    }
                }
            }
        }

        public static void ExtractToDirectory(string sourceArchiveFileName, string destinationDirectoryName, IProgress<double> progress)
        {
            using (ZipArchive archive = ZipFile.OpenRead(sourceArchiveFileName))
            {
                double totalBytes = archive.Entries.Sum(e => e.Length);
                long currentBytes = 0;

                foreach (ZipArchiveEntry entry in archive.Entries)
                {
                    string fileName = Path.Combine(destinationDirectoryName, entry.FullName);

                    Directory.CreateDirectory(Path.GetDirectoryName(fileName));
                    using (Stream inputStream = entry.Open())
                    using (Stream outputStream = File.OpenWrite(fileName))
                    {
                        Stream progressStream = new StreamWithProgress(outputStream, null,
                            new BasicProgress<int>(i =>
                            {
                                currentBytes += i;
                                progress.Report(currentBytes / totalBytes);
                            }));

                        inputStream.CopyTo(progressStream);
                    }

                    File.SetLastWriteTime(fileName, entry.LastWriteTime.LocalDateTime);
                }
            }
        }
    }

Notes:

  • This uses a class called BasicProgress<T> (see below). I tested the code in a console program, and the built-in Progress<T> class will use the thread pool to execute the ProgressChanged event handlers, which in turn can lead to out-of-order progress reports. BasicProgress<T> simply calls the handler directly, avoiding that issue. In a GUI program using Progress<T>, the execution of the event handlers would be dispatched to the UI thread in order. IMHO, one should still use the synchronous BasicProgress<T> in a library, but the client code for a UI program would be fine using Progress<T>; indeed, that would probably be preferable, since it handles the cross-thread dispatching on your behalf (see the sketch after the test program below).
  • This tallies the sum of the file lengths before doing any work. Of course, this incurs a slight start-up cost. For some scenarios, it might be sufficient to just report total bytes processed, and let the client code worry about whether there's a need to do that initial tally or not.

BasicProgress.cs

    using System;

    class BasicProgress<T> : IProgress<T>
    {
        private readonly Action<T> _handler;

        public BasicProgress(Action<T> handler)
        {
            _handler = handler;
        }

        void IProgress<T>.Report(T value)
        {
            _handler(value);
        }
    }

And of course, a little program to test it all:

Program.cs

    using System;
    using System.IO;

    class Program
    {
        static void Main(string[] args)
        {
            string sourceDirectory = args[0],
                archive = args[1],
                archiveDirectory = Path.GetDirectoryName(Path.GetFullPath(archive)),
                unpackDirectoryName = Guid.NewGuid().ToString();

            File.Delete(archive);
            ZipFileWithProgress.CreateFromDirectory(sourceDirectory, archive,
                new BasicProgress<double>(p => Console.WriteLine($"{p:P2} archiving complete")));

            ZipFileWithProgress.ExtractToDirectory(archive, unpackDirectoryName,
                new BasicProgress<double>(p => Console.WriteLine($"{p:P0} extracting complete")));
        }
    }
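
For a GUI client like the one in the question, the wiring mentioned in the notes above might look roughly like this sketch (the control names progressBar1 and beginBackup are assumptions):

    // Sketch of WinForms client code: Progress<double> marshals each report back to
    // the UI thread, and Task.Run keeps the blocking archive work off of it.
    private async void beginBackup_Click(object sender, EventArgs e)
    {
        var progress = new Progress<double>(p => progressBar1.Value = (int)(p * 100));
        string zipName = "bak" + DateTime.Now.Ticks + ".zip";

        await Task.Run(() => ZipFileWithProgress.CreateFromDirectory(
            Properties.Settings.Default.source,
            Path.Combine(Properties.Settings.Default.destination, zipName),
            progress));

        MessageBox.Show("Done!");
    }

Note that Progress<T> captures the synchronization context of the thread it is constructed on, which is why it should be created on the UI thread as shown.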

I think the following is worth sharing: it zips the files individually rather than zipping the folder, while retaining the relative paths of the files:

    void CompressFolder(string folder, string targetFilename)
    {
        string[] allFilesToZip = Directory.GetFiles(folder, "*.*", System.IO.SearchOption.AllDirectories);

        // You can use this count as the total for progress reporting.
        int size = allFilesToZip.Length;

        // Use this counter to report the current progress.
        int progress = 0;

        // To have relative paths in the zip.
        string pathToRemove = folder + "\\";

        using (ZipArchive zip = ZipFile.Open(targetFilename, ZipArchiveMode.Create))
        {
            // Go over all files and zip them.
            foreach (var file in allFilesToZip)
            {
                string fileRelativePath = file.Replace(pathToRemove, "");

                // It is not mentioned in the MS documentation, but the entry name
                // can be a relative path including the file name; this creates a
                // zip with folder structure and not only flat files.
                zip.CreateEntryFromFile(file, fileRelativePath);
                progress++;

                // ---------------------------
                // TBD: Notify about progress.
                // ---------------------------
            }
        }
    }

Notes:

  • You can use FileInfo fileInfo = new FileInfo(file); and fileInfo.Length to report progress by the size of the files rather than by the number of files. Sometimes this is more realistic. For that you will also need to accumulate the total size of the folder in advance (see the sketch after these notes).
  • This solution worked for me.
  • I did not notice any performance difference between zipping the entire directory and zipping each file individually, though I did not measure this rigorously.
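
For illustration, a sketch of that size-based variant might look like the following (the IProgress<double> parameter and the method name are hypothetical additions, and the usual System, System.IO, System.IO.Compression, and System.Linq using directives are assumed):

    // Hypothetical size-based variant of CompressFolder: progress is reported as the
    // fraction of total bytes processed rather than the fraction of files processed.
    void CompressFolderWithProgress(string folder, string targetFilename, IProgress<double> progress)
    {
        string[] allFilesToZip = Directory.GetFiles(folder, "*.*", SearchOption.AllDirectories);

        // Accumulate the total folder size in advance (Math.Max avoids dividing by
        // zero for an empty folder).
        long totalBytes = Math.Max(1, allFilesToZip.Sum(f => new FileInfo(f).Length));
        long processedBytes = 0;

        // To have relative paths in the zip.
        string pathToRemove = folder + "\\";

        using (ZipArchive zip = ZipFile.Open(targetFilename, ZipArchiveMode.Create))
        {
            foreach (var file in allFilesToZip)
            {
                string fileRelativePath = file.Replace(pathToRemove, "");
                zip.CreateEntryFromFile(file, fileRelativePath);

                // Report after each file, weighted by its uncompressed size.
                processedBytes += new FileInfo(file).Length;
                progress.Report((double)processedBytes / totalBytes);
            }
        }
    }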