Unzipping files that are flying in through a pipe

Can I make unzip or any similar programs work on the standard output? The situation is I'm downloading a zip file, which is supposed to be unzipped on the fly.

Related issue: How do I pipe a downloaded file to standard output in bash?


Solution 1:

While a zip file is in fact a container format, there's no reason why it can't be read from a pipe (stdin) if the file can fit into memory easily enough. Here's a Python script that takes a zip file as standard input and extracts the contents to the current directory or to a specified directory if specified.

import zipfile
import sys
import StringIO
data = StringIO.StringIO(sys.stdin.read())
z = zipfile.ZipFile(data)
dest = sys.argv[1] if len(sys.argv) == 2 else '.'
z.extractall(dest)

This script can be minified to one line and created as an alias.

alias unzip-stdin="python -c \"import zipfile,sys,StringIO;zipfile.ZipFile(StringIO.StringIO(sys.stdin.read())).extractall(sys.argv[1] if len(sys.argv) == 2 else '.')\""

Now unzip the output of wget easily.

wget http://your.domain.com/your/file.zip -O - | unzip-stdin target_dir

Solution 2:

This is unlikely to work how you expect. Zip is not just a compression format, but also a container format. It rolls up the jobs of both tar and gzip.bzip2 into one. Having said that, if your zip has a single file, you can use unzip -p to extract the files to stdout. If you have more than one file, there's no way for you to tell where they start and stop.

As for reading from stdin, the unzip man page has this sentence:

Archives read from standard input are not yet supported, except with funzip (and then only the first member of the archive can be extracted).

You might have some luck with funzip.

Solution 3:

I like to use curl because it is installed by default (the -L is needed for redirects which often occur):

curl -L http://example.com/file.zip | bsdtar -xvf - -C /path/to/directory/

However, bsdtar is not installed by default, and I could not get funzip to work.