generate a Zip file from azure blob storage files

I have some files stored in my windows azure blob storage. I want to take these files, create a zip file and store them in a new folder. Then return the path to the zip file. Set permission to the zip file location so that my users can download the zip file to their local machines by clicking on the link

 https://mystorage.blob.core.windows.net/myfiles/2b5f8ea6-3dc2-4b77-abfe-4da832e02556/AppList/isjirleq/mydocs1.doc
 https://mystorage.blob.core.windows.net/myfiles/2b5f8ea6-3dc2-4b77-abfe-4da832e02556/tempo/xyz/mymusic.mp3
 https://mystorage.blob.core.windows.net/myfiles/2b5f8ea6-3dc2-4b77-abfe-4da832e02556/general/video/myVideo.wmv
 https://mystorage.blob.core.windows.net/myfiles/2b5f8ea6-3dc2-4b77-abfe-4da832e02556/photo/photo1.png

I want to be able to loop through these files and zip them all together to create a new zip file

(https://mystorage.blob.core.windows.net/myzippedfiles/allmyFiles.zip ) and return the path to the zip file

I have a large number of files in my azure blob. So downloading, zipping and uploading them is not a good idea.

How can I do this? I need some sample code to do this


Solution 1:

We have solved this problem (partially) by zipping the files directly to the output stream using the blob streams. This avoids the issue of downloading zipping then sending and avoids the delay while this happens (we used ICSharpZipLib, reference). But it still means routing the stream through the web server:

  public void ZipFilesToResponse(HttpResponseBase response, IEnumerable<Asset> files, string zipFileName)
    {
        using (var zipOutputStream = new ZipOutputStream(response.OutputStream))
        {
            zipOutputStream.SetLevel(0); // 0 - store only to 9 - means best compression
            response.BufferOutput = false;
            response.AddHeader("Content-Disposition", "attachment; filename=" + zipFileName);
            response.ContentType = "application/octet-stream";

            foreach (var file in files)
            {
                var entry = new ZipEntry(file.FilenameSlug())
                {
                    DateTime = DateTime.Now,
                    Size = file.Filesize
                };
                zipOutputStream.PutNextEntry(entry);
                storageService.ReadToStream(file, zipOutputStream);
                response.Flush();
                if (!response.IsClientConnected)
                {
                   break;
                }
            }
            zipOutputStream.Finish();
            zipOutputStream.Close();
        }
        response.End();
    }

The storage service simply does this:

public void ReadToStream(IFileIdentifier file, Stream stream, StorageType storageType = StorageType.Stored, ITenant overrideTenant = null)
    {
        var reference = GetBlobReference(file, storageType, overrideTenant);
        reference.DownloadToStream(stream);
    }
private CloudBlockBlob GetBlobReference(IFileIdentifier file, StorageType storageType = StorageType.Stored, ITenant overrideTenant = null)
        {
            var filepath = GetFilePath(file, storageType);
            var container = GetTenantContainer(overrideTenant);
            return container.GetBlockBlobReference(filepath);
        }

Solution 2:

Since blob storage is "just" an object store, you would need to download them somewhere (it could be a web/worker role or your local computer), zip them and then reupload the zip file. That's the only way to do it as far as I know.