Is there a quick way to get the very last file in a large TAR?
Let's assume I have a several gigabyte tar file, but I also happen to know that the very last file written to the archive is something important that I need. Since tar files are appended sequentially, is there a way I can make tar read into the archive from the end to find this file, instead of starting from the beginning and reading over gigabytes of irrelevant data?
No, unfortunately there is not. From Wikipedia:
Another weakness of the tar format compared to other archive formats is that there is no centralized location for the information about the contents of the file (a "table of contents" of sorts). So to list the names of the files that are in the archive, one must read through the entire archive and look for places where files start. Also, to extract one small file from the archive, instead of being able to lookup the offset in a table and go directly to that location, like other archive formats, with tar, one has to read through the entire archive, looking for the place where the desired file starts. For large tar archives, this causes a big performance penalty, making tar archives unsuitable for situations that often require random access of individual files.
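To make "look for places where files start" concrete, here is a rough bash sketch of that walk over an uncompressed archive. It relies on the standard header layout: 512-byte blocks, the member name in the first 100 bytes of the header, and the size as octal text in the 12 bytes at offset 124. The function name list_tar_offsets is made up for this illustration. It also assumes sizes are stored in plain octal (so no members over 8 GiB) and does no special handling of long-name or pax entries. On a regular file, only the headers are read, because the size field says how far to jump to the next one.

```shell
#!/bin/bash
# list_tar_offsets TAR  (hypothetical helper, illustration only)
# Prints "HEADER_BLOCK SIZE NAME" for each member of an uncompressed tar.
list_tar_offsets() {
    local tar="$1" block=0 name size_octal size
    while :; do
        # Member name: first 100 bytes of the header block, NUL-padded
        name="$(dd if="$tar" bs=1 skip=$(( block * 512 )) count=100 2>/dev/null | tr -d '\0')"
        # An empty name means an end-of-archive zero block (or end of file)
        [ -z "$name" ] && break
        # Size: 12 bytes at offset 124, octal ASCII
        size_octal="$(dd if="$tar" bs=1 skip=$(( block * 512 + 124 )) count=12 2>/dev/null | tr -d '\0 ')"
        if ! [[ "$size_octal" =~ ^[0-7]+$ ]]; then
            echo "Unsupported size field at block $block" >&2
            return 1
        fi
        size=$(( 8#$size_octal ))
        echo "$block $size $name"
        # Next header: skip this header plus the data rounded up to whole blocks
        block=$(( block + 1 + (size + 511) / 512 ))
    done
}
```

Calling list_tar_offsets on an archive prints one line per member. The walk is still sequential from the first header, so it does not answer the original question, but it shows why a seekable archive does not require reading every data byte.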
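Yes; if you know the size of the file you want, you can copy it out of the end of the tar with dd and its skip option. Note that the data does not sit at the very last bytes: the archive ends with at least two 512-byte blocks of zeros, usually padded out to a full record (10240 bytes by default in GNU tar). Alternatively, if you are willing to read the whole archive once in exchange for quick random access later, you can build an index with GNU tar's -R (--block-number) option: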
tar -tRvf "$TAR"
An example script:
#!/bin/bash
#
# tar_extract_via_index.sh
#
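TAR="$1"   # path to an uncompressed tar archive
RE="$2"    # Perl-compatible regex that selects exactly one index line
if [ ! -f "$TAR" ] ; then
    echo "Not a file: $TAR" >&2
    exit 1
fi
if [ "$RE" == "" ] ; then
    echo "Usage: $0 TAR REGEX" >&2
    exit 2
fi
# Build the index once; -R prefixes each line with the member's block number
if [ ! -f "$TAR".index ] ; then
    tar -tRvf "$TAR" > "$TAR".index
fi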
MATCH="$(grep -P -- "$RE" "$TAR".index)"
# Zero matches and several matches are both errors
if [ "$(echo "$MATCH" | grep -c .)" != "1" ] ; then
    echo "Expected exactly one match for '$RE', got:" >&2
    echo "$MATCH" | perl -pe 's/^/\t/g' >&2
    exit 3
fi
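# Index lines look like "block N: PERMS OWNER/GROUP SIZE DATE TIME ./path/name";
# the patterns below assume member names start with "./"
FILE="$( echo "$MATCH" | perl -pe 's/.* \.\///g;s/.*\///g')"
SKIP="$( echo "$MATCH" | perl -pe 's/:.*//g;s/.* //g')"
COUNT="$(echo "$MATCH" | perl -pe 's/\.\/.*//g;s/.*\/[^ ]+ +//g;s/ .*//g')"
# The data starts one 512-byte block after the member's header block
SKIP="$(( (SKIP + 1) * 512 ))"
# skip_bytes/count_bytes (GNU dd) keep byte offsets while allowing a large
# block size; bs=1 would copy one byte per read and is very slow on big files
dd if="$TAR" of="$FILE" bs=1M iflag=skip_bytes,count_bytes status=none skip="$SKIP" count="$COUNT"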
echo "$FILE"
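The first suggestion above (copying the file out of the end of the archive when you already know its size) could look roughly like the sketch below. The function name extract_last_member and its arguments are made up for this illustration. Because the amount of trailing zero padding is not known in advance, it tries each plausible position for the last header, and accepts one only if it carries the "ustar" magic at offset 257 and its size field (octal, 12 bytes at offset 124) equals the expected size. It assumes an uncompressed ustar or GNU format archive, and the limit of 64 trailing zero blocks is an arbitrary allowance that covers the default 20-block record.

```shell
#!/bin/bash
# extract_last_member TAR SIZE OUT  (hypothetical helper, illustration only)
# Copies the data of the last member, whose size in bytes is known, to OUT.
extract_last_member() {
    local tar="$1" size="$2" out="$3"
    local total data_blocks zeros hdr field magic
    total=$(( $(wc -c < "$tar") / 512 ))      # archive length in 512-byte blocks
    data_blocks=$(( (size + 511) / 512 ))     # data is padded to whole blocks
    # The archive ends with at least 2 zero blocks plus record padding
    for zeros in $(seq 2 64); do
        hdr=$(( total - zeros - data_blocks - 1 ))   # candidate header block
        [ "$hdr" -lt 0 ] && break
        magic="$(dd if="$tar" bs=1 skip=$(( hdr * 512 + 257 )) count=5 2>/dev/null | tr -d '\0')"
        field="$(dd if="$tar" bs=1 skip=$(( hdr * 512 + 124 )) count=12 2>/dev/null | tr -d '\0 ')"
        if [ "$magic" = "ustar" ] && [[ "$field" =~ ^[0-7]+$ ]] \
           && [ $(( 8#$field )) -eq "$size" ]; then
            # Data begins in the block right after the header
            tail -c +$(( (hdr + 1) * 512 + 1 )) "$tar" | head -c "$size" > "$out"
            return 0
        fi
    done
    return 1
}
```

For example, extract_last_member big.tar 600 wanted.bin reads a handful of headers near the end and then only the 600 data bytes. It returns non-zero if no header with that size is found where the last member should be.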