Faster way to grep inside a tar.gz without extracting
I am trying to grep for a pattern in a dozen .tar.gz files, but it is very slow.
I am using:
tar -ztf file.tar.gz | while read FILENAME
do
if tar -zxf file.tar.gz "$FILENAME" -O | grep "string" > /dev/null
then
echo "$FILENAME contains string"
fi
done
Solution 1:
If you have zgrep, you can use:
zgrep -a string file.tar.gz
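Keep in mind that zgrep simply decompresses the file and runs grep over the result. For a tar.gz that result is the raw tar stream, so a match tells you the string occurs somewhere in the archive but not which member it came from. A minimal sketch of the equivalent pipeline, assuming gzip and grep are installed (count_matches is a hypothetical helper name, not part of any tool):

```shell
# count_matches ARCHIVE PATTERN  (hypothetical helper)
# Decompress the archive to stdout and count the lines of the raw tar
# stream that match PATTERN. -a makes grep treat the binary stream as text.
count_matches() {
    gzip -dc "$1" | grep -a -c -- "$2"
}
```

Usage would be something like count_matches file.tar.gz string. This is fine as a quick "is it in there at all" check before trying one of the per-member approaches.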
Solution 2:
You can use the --to-command option (GNU tar) to pipe each file to an arbitrary command. This lets you process the archive in a single pass (and without a temporary file). See also this question, and the manual.
Armed with the above information, you could try something like:
$ tar xf file.tar.gz --to-command "awk '/bar/ { print ENVIRON[\"TAR_FILENAME\"]; exit }'"
bfe2/.bferc
bfe2/CHANGELOG
bfe2/README.bferc
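If you prefer grep patterns over awk, the same idea can be sketched as a small function. This assumes GNU tar, which runs the command once per regular file with the file contents on stdin and the member name in TAR_FILENAME. The function name list_matching_members and the PATTERN environment variable are made up for this example:

```shell
# list_matching_members ARCHIVE PATTERN  (hypothetical helper)
# Assumes GNU tar (for --to-command). Reads the archive once and prints
# the name of every member whose contents match PATTERN.
list_matching_members() {
    # PATTERN is passed through the environment so the quoting inside
    # the command string stays simple. The trailing "true" keeps tar from
    # reporting an error when grep finds nothing in a member.
    PATTERN="$2" tar -xzf "$1" \
        --to-command='grep -- "$PATTERN" >/dev/null && printf "%s\n" "$TAR_FILENAME"; true'
}
```

Usage would be something like list_matching_members file.tar.gz bar.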
Solution 3:
I know this question is 4 years old, but I have a couple of different options:
Option 1: Using tar --to-command grep
The following line will look in example.tgz for PATTERN. This is similar to @Jester's example, but I couldn't get his pattern matching to work.
tar xzf example.tgz --to-command 'grep --label="$TAR_FILENAME" -H PATTERN ; true'
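Since the question mentions a dozen archives, the command above can be wrapped in a loop. This is only a sketch, assuming GNU tar and GNU grep. The function name grep_archives and the ARCHIVE and PATTERN environment variables are invented for the example:

```shell
# grep_archives PATTERN ARCHIVE...  (hypothetical helper)
# Assumes GNU tar (--to-command) and GNU grep (--label).
# Each output line looks like ARCHIVE:MEMBER:matching line.
grep_archives() {
    pattern=$1
    shift
    for archive in "$@"; do
        # Each archive is read once; grep labels its matches with the
        # archive name and the member name that tar exports.
        ARCHIVE="$archive" PATTERN="$pattern" tar -xzf "$archive" \
            --to-command='grep --label="$ARCHIVE:$TAR_FILENAME" -H -- "$PATTERN"; true'
    done
}
```

Usage would be something like grep_archives PATTERN *.tar.gz.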
Option 2: Using tar -tzf
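The second option is to use tar -tzf to list the files, then go through them with grep. Note that this decompresses the archive once per member, so it is slower than Option 1 on large archives. You can create a function to use it over and over: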
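targrep () {
    # Read the member list line by line so names with spaces are not split
    tar -tzf "$1" | while IFS= read -r i; do
        # Skip directory entries (listed with a trailing slash)
        case $i in */) continue ;; esac
        # Extract one member to stdout; grep prefixes each match with its name
        tar -Oxzf "$1" "$i" | grep --label="$i" -H "$2"
    done
}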
Usage:
targrep example.tar.gz "pattern"