Compressing directory using shutil.make_archive() while preserving directory structure

Solution 1:

Using the terms in the documentation, you have specified a root_dir, but not a base_dir. Try specifying the base_dir like so:

shutil.make_archive('/home/code/test_dicoms',
                    'zip',
                    '/home/code/',
                    'test_dicoms')
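
If you want to see the difference for yourself, here is a minimal sketch that runs both variants against a throwaway directory and lists the entry names. The helper archive_names and the file scan.dcm are made up for illustration; they are not part of the question.

```python
import os
import shutil
import tempfile
import zipfile

def archive_names(root_dir, base_dir=None):
    """Zip root_dir (optionally limited to base_dir) and return the file entries."""
    with tempfile.TemporaryDirectory() as out:
        path = shutil.make_archive(os.path.join(out, 'demo'), 'zip', root_dir, base_dir)
        with zipfile.ZipFile(path) as zf:
            # Skip explicit directory entries so only files are compared.
            return sorted(n for n in zf.namelist() if not n.endswith('/'))

with tempfile.TemporaryDirectory() as code:
    # Hypothetical stand-in for /home/code/test_dicoms.
    os.makedirs(os.path.join(code, 'test_dicoms'))
    with open(os.path.join(code, 'test_dicoms', 'scan.dcm'), 'w') as f:
        f.write('placeholder')

    # root_dir only: entries are relative to the directory itself.
    flat = archive_names(os.path.join(code, 'test_dicoms'))
    # root_dir plus base_dir: the top-level folder name is kept.
    nested = archive_names(code, 'test_dicoms')

print(flat)    # ['scan.dcm']
print(nested)  # ['test_dicoms/scan.dcm']
```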

To answer your second question, it depends on the version of Python you are using. Starting from Python 3.4, ZIP64 extensions are enabled by default. Prior to Python 3.4, make_archive will not automatically create a file with ZIP64 extensions. If you are using an older version of Python and want ZIP64, you can invoke the underlying zipfile.ZipFile() directly.

If you choose to use zipfile.ZipFile() directly, bypassing shutil.make_archive(), here is an example:

import zipfile
import os

d = '/home/code/test_dicoms'

os.chdir(os.path.dirname(d))
with zipfile.ZipFile(d + '.zip',
                     "w",
                     zipfile.ZIP_DEFLATED,
                     allowZip64=True) as zf:
    for root, _, filenames in os.walk(os.path.basename(d)):
        for name in filenames:
            name = os.path.join(root, name)
            name = os.path.normpath(name)
            zf.write(name, name)
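
A possible variation, if you would rather not call os.chdir() (it changes the working directory for the whole process): compute each archive name with os.path.relpath() instead. This is only a sketch; zip_dir and the sample file names are hypothetical.

```python
import os
import tempfile
import zipfile

def zip_dir(directory, zip_path):
    """Zip `directory` into `zip_path`, keeping the folder name as the top-level entry."""
    directory = os.path.abspath(directory)
    parent = os.path.dirname(directory)
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED, allowZip64=True) as zf:
        for root, _, filenames in os.walk(directory):
            for name in filenames:
                full = os.path.join(root, name)
                # The archive name is the path relative to the parent directory.
                zf.write(full, os.path.relpath(full, parent))

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: test_dicoms/series1/scan.dcm
    src = os.path.join(tmp, 'test_dicoms')
    os.makedirs(os.path.join(src, 'series1'))
    with open(os.path.join(src, 'series1', 'scan.dcm'), 'w') as f:
        f.write('placeholder')

    zip_path = os.path.join(tmp, 'test_dicoms.zip')
    zip_dir(src, zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()

print(names)  # ['test_dicoms/series1/scan.dcm']
```

Like the loop above, this only writes files, so empty directories are not recorded in the archive.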

Reference:

  • https://docs.python.org/library/shutil.html#shutil.make_archive
  • https://docs.python.org/library/zipfile.html#zipfile-objects

Solution 2:

I have written a wrapper function myself because shutil.make_archive is too confusing to use.

Here it is: http://www.seanbehan.com/how-to-use-python-shutil-make_archive-to-zip-up-a-directory-recursively-including-the-root-folder/

And just the code:

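import os, shutil

def make_archive(source, destination):
    # Split 'folder.zip' into the archive name and format on the last dot.
    base = os.path.basename(destination)
    name, ext = os.path.splitext(base)
    format = ext.lstrip('.')
    # Drop a trailing separator so dirname/basename refer to the folder itself.
    source = source.rstrip(os.sep)
    archive_from = os.path.dirname(source) or os.curdir
    archive_to = os.path.basename(source)
    # make_archive writes '<name>.<format>' to the current directory; move it into place.
    shutil.make_archive(name, format, archive_from, archive_to)
    shutil.move('%s.%s' % (name, format), destination)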

make_archive('/path/to/folder', '/path/to/folder.zip')
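
Since base_name may itself contain a directory, the move step can probably be avoided by handing make_archive the destination path without its extension. A minimal sketch, assuming a zip-only wrapper; zip_folder and the demo file names are hypothetical.

```python
import os
import shutil
import tempfile
import zipfile

def zip_folder(source, destination):
    """Zip `source` to `destination` (a path ending in '.zip'), keeping the folder name."""
    source = os.path.abspath(source)
    base_name, ext = os.path.splitext(os.path.abspath(destination))
    if ext != '.zip':
        raise ValueError('destination must end in .zip')
    # make_archive appends the '.zip' extension to base_name itself.
    return shutil.make_archive(base_name, 'zip',
                               root_dir=os.path.dirname(source),
                               base_dir=os.path.basename(source))

with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, 'folder')
    os.makedirs(folder)
    with open(os.path.join(folder, 'file.txt'), 'w') as f:
        f.write('hello')

    destination = os.path.join(tmp, 'folder.zip')
    zip_folder(folder, destination)
    with zipfile.ZipFile(destination) as zf:
        entries = sorted(n for n in zf.namelist() if not n.endswith('/'))

print(entries)  # ['folder/file.txt']
```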

Solution 3:

There are basically 2 approaches to using shutil: you may try to understand the logic behind it or you may just use an example. I couldn't find an example here so I tried to create my own.

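TL;DR: from temp, run shutil.make_archive('dir1_arc', 'zip', base_dir='dir1') to keep dir1 as the top-level folder inside the archive, or shutil.make_archive('dir1_arc', 'zip', root_dir='dir1') (equivalently, just shutil.make_archive('dir1_arc', 'zip', 'dir1')) to archive only the contents of dir1.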

Suppose you have ~/temp/dir1:

temp $ tree dir1
dir1
├── dir11
│   ├── file11
│   ├── file12
│   └── file13
├── dir1_arc.zip
├── file1
├── file2
└── file3

How can you create an archive of dir1? Set base_name='dir1_arc' and format='zip'. You then have several options:

  • cd into dir1 and run shutil.make_archive(base_name=base_name, format=format); this creates dir1_arc.zip inside dir1. The one oddity is that the archive is written into the directory being archived, so you'll find a file named dir1_arc.zip inside the archive itself;
  • from temp, run shutil.make_archive(base_name=base_name, format=format, base_dir='dir1'); you'll get dir1_arc.zip inside temp, which unzips into dir1; root_dir defaults to the current directory, temp;
  • from ~, run shutil.make_archive(base_name=base_name, format=format, root_dir='temp', base_dir='dir1'); you'll again get your archive, but this time inside ~;
  • create another directory temp2 in ~ and run inside it: shutil.make_archive(base_name=base_name, format=format, root_dir='../temp', base_dir='dir1'); you'll get your archive in temp2.
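
To check where the archive lands, here is a rough sketch of the last option run against a scratch copy of the layout. The working_dir helper is hypothetical, and a temporary directory stands in for ~.

```python
import contextlib
import os
import shutil
import tempfile

@contextlib.contextmanager
def working_dir(path):
    """Temporarily switch the current directory, restoring it afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)

with tempfile.TemporaryDirectory() as home:
    # Reduced version of the layout: ~/temp/dir1/file1 and ~/temp2.
    os.makedirs(os.path.join(home, 'temp', 'dir1'))
    os.makedirs(os.path.join(home, 'temp2'))
    with open(os.path.join(home, 'temp', 'dir1', 'file1'), 'w') as f:
        f.write('x')

    # Run from temp2 with root_dir pointing back at temp.
    with working_dir(os.path.join(home, 'temp2')):
        shutil.make_archive('dir1_arc', 'zip', root_dir='../temp', base_dir='dir1')

    # A relative base_name is resolved against the directory you ran from.
    in_temp2 = os.path.exists(os.path.join(home, 'temp2', 'dir1_arc.zip'))
    in_temp = os.path.exists(os.path.join(home, 'temp', 'dir1_arc.zip'))

print(in_temp2, in_temp)  # True False
```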

Can you call make_archive without naming the arguments? You can. From temp, run shutil.make_archive('dir1_arc', 'zip', 'dir1'). This is the same as running shutil.make_archive('dir1_arc', 'zip', root_dir='dir1'). What can we say about base_dir in this case? The documentation does not say much, but from the source code we can see that:

if root_dir is not None:
    os.chdir(root_dir)

if base_dir is None:
    base_dir = os.curdir

So in our case base_dir is os.curdir ('.'), which after the chdir refers to dir1. And we can keep asking questions.
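
One way to confirm that the third positional argument really is root_dir, without reading the source, is to inspect the signature and compare the two calls. A small sketch using a cut-down copy of dir1 in a temporary directory; file_entries is a hypothetical helper.

```python
import inspect
import os
import shutil
import tempfile
import zipfile

# The first four parameters of make_archive, in positional order.
params = list(inspect.signature(shutil.make_archive).parameters)[:4]

def file_entries(archive):
    """Return the file (non-directory) entries of a zip archive, sorted."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(n for n in zf.namelist() if not n.endswith('/'))

with tempfile.TemporaryDirectory() as temp:
    # Cut-down dir1: dir1/file1 and dir1/dir11/file11.
    dir1 = os.path.join(temp, 'dir1')
    os.makedirs(os.path.join(dir1, 'dir11'))
    for rel in ('file1', os.path.join('dir11', 'file11')):
        with open(os.path.join(dir1, rel), 'w') as f:
            f.write('x')

    # Archives are written next to dir1, not inside it.
    positional = file_entries(shutil.make_archive(os.path.join(temp, 'a'), 'zip', dir1))
    keyword = file_entries(shutil.make_archive(os.path.join(temp, 'b'), 'zip', root_dir=dir1))

print(params)      # ['base_name', 'format', 'root_dir', 'base_dir']
print(positional)  # ['dir11/file11', 'file1']
print(keyword)     # ['dir11/file11', 'file1']
```

Both calls produce entries without a dir1/ prefix, which is what you would expect if base_dir falls back to the current directory of root_dir.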