How do I make a compressed tar archive when there are too many filenames for the shell to expand on a single line?
Normally I would just do something like:
tar -czf archive.tar.gz *.csv
But when there are too many files in the directory for the shell to expand on a single line this doesn't work.
In these cases I would normally resort to using find. Something like:
find /path -name '*.csv' -exec tar -rf "./archive.tar.gz" {} +
But this only seems to work if I don't include the -z option, because you can't append to compressed archives, and using -c instead of -r will overwrite the first archive, since find runs tar multiple times.
The only other solution I could come up with is to create a .tar file with find (as above) and then use a second command to compress it. Is there a better way to handle cases like this?
I'm using Ubuntu Linux.
Solution 1:
As a robust solution, have find separate the filenames with null characters, and pipe them directly to tar, which reads null-delimited input when given --null:
find . -maxdepth 1 -name '*.csv' -print0 |
tar -czf archive.tgz --null -T -
This will now handle all file names correctly and is not limited by the number of files either.
Using ls to generate a list of filenames to be parsed by another program is a common antipattern that should be avoided whenever possible. find can generate null-delimited output (-print0) that many utilities, such as tar and xargs, can read or parse further. Since the null character and / are the only characters that cannot appear in a filename, null-delimited output is always safe.
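To see the null-delimited pipeline cope with awkward names, here is a small sketch. The scratch directory, the file names, and the variable names (workdir, count) are made up for the demonstration, and it assumes GNU find and GNU tar as shipped on Ubuntu.

```shell
# Hypothetical scratch directory, used only for this demonstration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create CSV files whose names contain a space and a newline.
touch 'plain.csv' 'with space.csv' 'with
newline.csv'

# Archive them with the null-delimited pipeline from above.
find . -maxdepth 1 -name '*.csv' -print0 |
  tar -czf archive.tgz --null -T -

# Extract into a separate directory and count the files by counting
# NUL terminators, so the newline in one name cannot skew the count.
mkdir out
tar -xzf archive.tgz -C out
count=$(find out -type f -print0 | tr -dc '\0' | wc -c)
echo "files archived: $count"
```

Counting NUL bytes rather than lines is deliberate: a line count would be thrown off by the file name that itself contains a newline.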
Solution 2:
No, you cannot append to a compressed tar file without uncompressing it first.
However, tar can accept its list of files to process from a file, so you can just do:
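# printf is a shell builtin, so the glob avoids the argument-length limit that "ls *.csv" would hit
printf '%s\n' *.csv > temp.txt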
tar -zcf ball.tgz -T temp.txt
@slhck points out that the above solution will not work if there are spaces (and probably other annoying characters) in your filenames. This version encloses each filename in double quotes:
printf '%s\n' *.csv | sed -e 's/^\(.*\)$/"\1"/' > temp.txt
tar -zcf ball.tgz -T temp.txt
(This will of course break if you have double quotes in your filenames, in which case you get what you deserve. :)
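If you want to keep the list-file approach but sidestep quoting altogether, one option is to write the list null-delimited and have tar read it with --null. This is only a sketch under assumptions: the scratch directory, the file names, the list name list.nul, and the variables workdir and count are all made up for illustration, and GNU find and GNU tar are assumed.

```shell
# Hypothetical scratch directory and file names for the demonstration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'a.csv' 'b c.csv' 'say "hi".csv'

# Write a NUL-delimited file list instead of a newline-delimited one.
find . -maxdepth 1 -name '*.csv' -print0 > list.nul

# Tell tar that the entries in the list file are NUL-terminated.
tar -zcf ball.tgz --null -T list.nul

# Extract elsewhere and count the files to confirm all of them made it.
mkdir out
tar -xzf ball.tgz -C out
count=$(find out -type f -print0 | tr -dc '\0' | wc -c)
echo "files archived: $count"
```

Because the entries are separated by NUL bytes rather than newlines, spaces and double quotes in the names need no escaping in the list file.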