How to split a tar file into smaller parts at file boundaries?

There is a tool, tarsplitter, which safely splits tar archives. You specify the number of parts to split the archive into, and it figures out where the file boundaries are.

https://github.com/AQUAOSOTech/tarsplitter

The smaller output archives won't be exactly the same size, but they will be pretty close, assuming the files in the original archive don't vary much in size.

Example - split the archive "files.tar" into 4 smaller archives:

tarsplitter -p 4 -i files.tar -o /tmp/parts

Creating:

/tmp/parts0.tar
/tmp/parts1.tar
/tmp/parts2.tar
/tmp/parts3.tar
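
If you want to double-check a split before deleting the original, comparing member names is a cheap sanity test. The following is only a minimal sketch using Python's standard tarfile module; the helper names member_names and parts_cover_original are made up for illustration and are not part of tarsplitter. It compares names only, not file contents.

```python
import tarfile

def member_names(path):
    """Return the member names stored in one tar archive."""
    with tarfile.open(path) as tar:
        return tar.getnames()

def parts_cover_original(original, parts):
    """True if the parts together hold exactly the original's member names."""
    expected = sorted(member_names(original))
    found = sorted(name for part in parts for name in member_names(part))
    return expected == found
```

For the example above you would call parts_cover_original("files.tar", ["/tmp/parts0.tar", "/tmp/parts1.tar", "/tmp/parts2.tar", "/tmp/parts3.tar"]).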

If recreating the archive from the original files is an option, this Bash script should do the trick (it is just one possible approach):

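#!/bin/bash
# Pack the files under a directory into numbered tar archives, starting a
# new archive whenever the running total of file sizes would exceed MAX.

if [ $# -ne 3 ] ; then
    echo -e "$0 in out max\n"
    echo -e "\tin:  input directory"
    echo -e "\tout: output directory"
    echo -e "\tmax: split size threshold in bytes"
    exit 1
fi

IN=$1 OUT=$2 MAX=$3 SEQ=0 TOT=0
# Print "size<TAB>path" for every regular file (-printf is a GNU find
# extension), then sort numerically so the smallest files come first.
find "$IN" -type f -printf '%s\t%p\n' |
sort -n |
while IFS=$'\t' read -r SIZE NAME ; do
    # Switch to the next archive if this file would push the total past MAX.
    if [ "$TOT" -ne 0 ] && [ $((TOT+SIZE)) -gt "$MAX" ] ; then
        SEQ=$((SEQ+1)) TOT=0
    fi
    TOT=$((TOT+SIZE))
    TAR=$OUT/$(printf '%08d' "$SEQ").tar
    # Append the file to the current archive.
    tar rf "$TAR" "$NAME"
done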

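It sorts all the files by size in ascending order and appends them to an archive one by one; it switches to a new archive when adding the next file would push the total past the threshold. A single file larger than the threshold still gets an archive of its own.

NOTE: Make sure that the output directory is empty, because the script appends to any archive of the same name that is already there.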

USE AT YOUR OWN RISK


I don't believe there are any existing tools to do this, but it would be reasonably easy to implement yourself. The tar format is pretty simple, so you would just need a splitter that takes it into account. The basic idea is to read a header, look at the stated length of the file that follows, and decide whether to start a new part now or write the file to the current part. Then read the next header and repeat.
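
The header-by-header loop described above can be sketched with Python's standard tarfile module, which does the header parsing for you. This is only a rough sketch under some assumptions: the function name split_tar, the out_prefix naming scheme and the max_bytes parameter are made up for illustration, the input is an uncompressed tar, and the size estimate ignores the extra blocks used by long-name or pax extended headers. Hard links whose target lands in a different part will not resolve when that part is extracted on its own.

```python
import tarfile

BLOCK = 512  # tar stores everything in 512-byte blocks

def split_tar(src_path, out_prefix, max_bytes):
    """Copy members of src_path into out_prefix0.tar, out_prefix1.tar, ...

    A new part is started whenever the next member would push the current
    part past max_bytes; members are never cut in the middle.
    """
    parts = []
    out = None
    total = 0
    with tarfile.open(src_path, "r:") as src:
        for member in src:  # yields one parsed header at a time
            # Approximate footprint: one header block plus padded data.
            size = BLOCK + ((member.size + BLOCK - 1) // BLOCK) * BLOCK
            if out is None or (total > 0 and total + size > max_bytes):
                if out is not None:
                    out.close()
                path = f"{out_prefix}{len(parts)}.tar"
                out = tarfile.open(path, "w")
                parts.append(path)
                total = 0
            # Only regular files carry data; other types are header-only.
            data = src.extractfile(member) if member.isreg() else None
            out.addfile(member, data)
            total += size
    if out is not None:
        out.close()
    return parts
```

Calling split_tar("files.tar", "/tmp/parts", 1024**3) would aim for parts of roughly 1 GiB each. A member that is larger than max_bytes on its own still ends up in a part by itself.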


The tarsplitter command offered by @ruffrey looks like an awesome option.
I downloaded it, then did:

brew install golang

to be able to compile it. (Hmm... is it already in Homebrew? Nope.) The command compiled successfully on my Mac running macOS 10.14. I'm currently making a copy of my gigantic archive to run tarsplitter against it. Two thumbs up for the recommendation.

I'm a relative noob when it comes to compiling other people's code, so it would have been helpful if the author had made it clear that it was written in Go instead of C/C++ and needed a new compiler installed. Also, make install doesn't work, as there's no install target in the Makefile, so I just did:

cp build/tarsplitter_mac /usr/local/bin/tarsplitter

Neat that the Go compiler built binaries for Mac, Linux, and Windows.