Why is tar|tar so much faster than cp?
Solution 1:
cp does open-read-close-open-write-close in a loop over all files, so reading from one place and writing to another are fully interleaved. tar | tar does the reading and the writing in separate processes connected by a pipe. tar itself is not multithreaded; the gain comes from the pipe buffer between the two processes, which lets the reading tar go on fetching the next files while the writing tar is still storing the previous ones. That gives the kernel and the disk controller a better chance to fetch, buffer and store many blocks of data at once. All in all, tar | tar lets each side work efficiently, while cp breaks the problem down into disparate, inefficiently small chunks.
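For reference, here is a minimal sketch of the two approaches side by side. The scratch directory and file names are made up for illustration; only the cp and tar invocations matter. It assumes a POSIX shell and a tar that accepts - as the archive name for stdout/stdin, which the common implementations do.

```shell
#!/bin/sh
# Hypothetical scratch layout, created only for this illustration.
work=$(mktemp -d)
src="$work/src"
mkdir -p "$src/sub" "$work/dst_cp" "$work/dst_tar"
printf 'hello\n' > "$src/a.txt"
printf 'world\n' > "$src/sub/b.txt"

# One process: cp reads and writes each file in turn.
cp -R "$src/." "$work/dst_cp/"

# Two processes: the first tar only reads files and writes an archive
# stream to stdout; the second tar only reads that stream from stdin
# and writes the files out. The pipe between them acts as a buffer.
(cd "$src" && tar -cf - .) | (cd "$work/dst_tar" && tar -xf -)

# Both destinations should now have the same contents as the source.
diff -r "$src" "$work/dst_cp" && diff -r "$src" "$work/dst_tar" && echo "copies match"
```

The subshells with cd keep the archive paths relative, so the tree is recreated directly under the destination directory. Remove the scratch directory in $work when done.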
Solution 2:
Your edit goes in the right direction: cp isn't necessarily slower than tar | tar. It depends, for example, on the number and size of the files. For big files a plain cp is best, since it's a simple job of pushing data around. For lots of small files the logistics are different and tar might do a better job. See for example this answer.