Equivalent to tar's "--strip-components=1" in unzip?
I have a script that extracts a tar.gz-file to a specified subdirectory mysubfolder:
mkdir mysubfolder; tar --extract --file=sourcefile.tar.gz --strip-components=1 --directory=mysubfolder;
Is there any equivalent way of doing this with a zip-file?
As Mathias said, unzip has no such option, but a one-liner bash script can do the job.
The problem is that the best approach depends on your archive layout. A solution that assumes a single top-level dir will fail miserably if the content is directly in the archive root (think about /a/foo, /b/foo and /foo, and the chaos of stripping /a and /b).
The same failure happens with tar --strip-components. There is no one-size-fits-all solution.
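If you want to know which layout you are dealing with before extracting anything, you can inspect the member list. The following is only a sketch: has_single_top_dir is a made-up name, and it assumes one member path per line on stdin, such as the output of unzip -Z1 "$zip" (zipinfo-style listing).

```shell
# Hypothetical helper: read archive member paths on stdin (one per line)
# and succeed only if all of them sit under one single top-level directory.
has_single_top_dir() {
    awk -F/ '
        NF < 2 || $1 == "" { loose = 1 }   # entry directly in the archive root
        { tops[$1] = 1 }                   # remember each first path component
        END {
            n = 0
            for (t in tops) n++
            exit !(n == 1 && !loose)       # exit status 0 means one top dir
        }'
}
```

It could be used as: unzip -Z1 "$zip" | has_single_top_dir && echo "Single dir!"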
So, to strip the root dir, assuming there is one (and only one):
unzip -d "$dest" "$zip" && f=("$dest"/*) && mv "$dest"/*/* "$dest" && rmdir "${f[@]}"
Just make sure second-level files/dirs do not have the same name as the top-level parent (for example, /foo/foo). But /foo/bar/foo and /foo/bar/bar are ok. If they do, or you just want to be safe, you can use a temp dir for extraction:
temp=$(mktemp -d) && unzip -d "$temp" "$zip" && mkdir -p "$dest" &&
mv "$temp"/*/* "$dest" && rmdir "$temp"/* "$temp"
If you're using Bash, you can test if top level is a single dir or not using:
f=("$temp"/*); (( ${#f[@]} == 1 )) && [[ -d "${f[0]}" ]] && echo "Single dir!"
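For shells other than Bash, the check and the move can be combined in portable sh. This is a rough sketch rather than a drop-in tool: strip_top_dir is a hypothetical name, it works on a directory that has already been extracted, and it assumes mktemp -d is available. It parks the top-level directory under a temporary name first, so a /foo/foo layout cannot collide, and it picks up hidden entries as well.

```shell
# Hypothetical helper: if "$1" contains exactly one entry and that entry is
# a real directory, move its contents up one level; otherwise do nothing.
strip_top_dir() (
    dest=$1
    # Collect every entry, hidden ones included; unmatched patterns stay
    # literal and are skipped by the existence test.
    set --
    for e in "$dest"/* "$dest"/.[!.]* "$dest"/..?*; do
        [ -e "$e" ] || [ -L "$e" ] || continue
        set -- "$@" "$e"
    done
    [ "$#" -eq 1 ] && [ -d "$1" ] && [ ! -L "$1" ] || exit 0
    top=$1
    # Park the top-level dir under a unique name so that a child named like
    # its parent (foo/foo) cannot clash during the move.
    tmp=$(mktemp -d "$dest/.strip.XXXXXX") || exit 1
    mv -- "$top" "$tmp/top" || exit 1
    for e in "$tmp"/top/* "$tmp"/top/.[!.]* "$tmp"/top/..?*; do
        [ -e "$e" ] || [ -L "$e" ] || continue
        mv -- "$e" "$dest"/ || exit 1
    done
    rmdir -- "$tmp/top" "$tmp"
)
```

Typical use would be: unzip -d "$dest" "$zip" && strip_top_dir "$dest"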
Speaking of Bash, you should turn on dotglob to include hidden files, and you can wrap everything in a single, handy function:
# unzip featuring an enhanced version of tar's --strip-components=1
# Usage: unzip-strip ARCHIVE [DESTDIR] [EXTRA_cp_OPTIONS]
# If DESTDIR is omitted, derive it (under the current dir) from the
# archive's top-level dir or, failing that, from the archive file name
unzip-strip() (
    set -eu
    local archive=$1
    local destdir=${2:-}
    shift; shift || :
    local tmpdir=$(mktemp -d)
    trap 'rm -rf -- "$tmpdir"' EXIT
    unzip -qd "$tmpdir" -- "$archive"
    shopt -s dotglob
    local files=("$tmpdir"/*) name i=1
    if (( ${#files[@]} == 1 )) && [[ -d "${files[0]}" ]]; then
        # Single top-level dir: name after it and copy its contents
        name=$(basename "${files[0]}")
        files=("$tmpdir"/*/*)
    else
        # Otherwise: name after the archive file, minus its extension
        name=$(basename "$archive"); name=${name%.*}
        files=("$tmpdir"/*)
    fi
    if [[ -z "$destdir" ]]; then
        destdir=./"$name"
    fi
    while [[ -f "$destdir" ]]; do destdir=${destdir}-$((i++)); done
    mkdir -p "$destdir"
    cp -ar "$@" -t "$destdir" -- "${files[@]}"
)
Now put that in your ~/.bashrc and you'll never have to worry about it again. Simply use as:
unzip-strip sourcefile.zip [mysubfolder] [OPTIONS]
This little beast will:
- Create mysubfolder for you if it does not exist
- Automatically detect if your zip archive contains a single top-level directory and handle the extraction accordingly.
- mysubfolder is optional. If blank it will extract to a subdir of the current directory (not necessarily the archive directory!), named after:
  - The single top-level directory in the archive, if there is one
  - or archive file name, without the (presumably .zip) extension
- If destination path, given or derived, already exists as a file, increment the name until a suitable one is found (new path or existing directory).
- By default:
  - be silent
  - overwrite any existing files
  - preserve links and attributes (mode, timestamps, etc)
- Pass extra OPTIONS to cp. Useful options are:
  - -v|--verbose: output each copied file, just like unzip does
  - -n|--no-clobber: do not overwrite existing files
  - -u|--update: only overwrite a file when the extracted copy is newer than the one at the destination
- Require extra OPTIONS to be the 3rd argument onward. Craft a proper argument parser if needed.
- Use double the extracted disk space during the operation, due to "extract to temp dir and copy" approach. No way around this without losing some of its flexibility/features.
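The "increment the name" rule can be looked at in isolation. Below is a small sketch of one possible variant, not a copy of the function above: pick_destdir is a hypothetical name, it treats anything that exists and is not a directory as taken (the function only tests for regular files), and it always appends the counter to the original name (foo-1, foo-2, and so on).

```shell
# Hypothetical helper: print the first usable destination for "$1".
# A path is usable if it does not exist yet or is already a directory.
pick_destdir() (
    base=$1
    candidate=$base
    i=1
    while [ -e "$candidate" ] && [ ! -d "$candidate" ]; do
        candidate=$base-$i
        i=$((i + 1))
    done
    printf '%s\n' "$candidate"
)
```

For example, destdir=$(pick_destdir ./sourcefile) leaves ./sourcefile untouched when it is free or already a directory, and moves on to ./sourcefile-1 when a file of that name is in the way.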
You can use -j to junk paths (do not make directories). This is only recommended for the fairly common case of single-level archives. Archives with multi-level directory structures will be flattened, which can even lead to name clashes among the extracted files.
From the man page of unzip:
-j junk paths. The archive's directory structure is not recreated; all files are deposited in the extraction directory (by default, the current one).
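Before reaching for -j, it may be worth checking whether flattening would make two members land on the same file name. A minimal sketch, assuming one member path per line on stdin (for instance from unzip -Z1 "$zip"); junk_path_clashes is a hypothetical name:

```shell
# Hypothetical helper: print every file name that would collide once
# directory paths are junked. Steps: drop directory entries (they end in
# a slash), strip the leading path, then report names seen more than once.
junk_path_clashes() {
    grep -v '/$' | sed 's|.*/||' | sort | uniq -d
}
```

Empty output means no two files share a name, so unzip -j should not have to overwrite anything from the same archive.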
I couldn’t find such an option in the manual pages for unzip, so I’m afraid this is impossible. :(
However, depending on the situation, you could work around it. For example, if you’re sure the only top-level directory in the zip file is named foo- followed by a version number, you could do something like this:
cd /tmp
unzip /path/to/file.zip
cd foo-*
cp -r . /path/to/destination/folder