How does this code work to compress a file?

Let's break this command down.

  1. compress_size_bzip2=anything sets the shell variable compress_size_bzip2 (an arbitrary name with no special meaning) to whatever is written on the right-hand side of the = sign.

  2. In our case, that anything is $(command). This is a command substitution: the shell runs command and replaces the whole construct with whatever command writes to its standard output (with trailing newlines removed).

  3. The command is bzip2 "$file" ; stat -c %s "$file.bz2", which is actually two commands executed one after the other. The first, bzip2 "$file", compresses the file whose name is stored in the shell variable file. The quotes are there in case the name contains spaces. Normally this command writes nothing to standard output. The second, stat -c %s "$file.bz2", prints the size in bytes of the file whose name is the value of file plus the extension .bz2.

So that size is the output of the whole command, and it is assigned to the variable compress_size_bzip2.
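As a quick illustration of points 2 and 3, here is a minimal sketch that uses stand-in commands instead of bzip2 and stat (the variable name result and the number are made up for this example). The first command prints nothing, so only the second command's output ends up in the variable:

```shell
# 'true' stands in for bzip2 (prints nothing); printf stands in for stat.
result=$(true ; printf '%s\n' 12345)

# The trailing newline written by printf is stripped by the substitution.
printf 'captured: [%s]\n' "$result"
```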

If you set the variable file to the filename you want to compress, for example file=myfile.txt, and then run the above line, two things will happen:

  1. the file myfile.txt will be compressed into myfile.txt.bz2 (by default bzip2 removes the original myfile.txt once it has succeeded)
  2. the size of myfile.txt.bz2 will be assigned to the variable compress_size_bzip2. You can display this value with the command echo "$compress_size_bzip2".
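To try this without touching real data, the following sketch works in a throwaway directory. It assumes bzip2 and GNU stat (for -c %s) are installed; the scratch directory and the sample contents are made up for the example.

```shell
tmpdir=$(mktemp -d)                    # scratch directory, removed at the end
file="$tmpdir/myfile.txt"
seq 1 1000 > "$file"                   # some sample content to compress

compress_size_bzip2=$(bzip2 "$file" ; stat -c %s "$file.bz2")
echo "compressed size in bytes: $compress_size_bzip2"

ls "$tmpdir"                           # shows myfile.txt.bz2
rm -r "$tmpdir"
```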

  1. bzip2 "$file"
    

    This runs bzip2 on the file whose name is stored in the variable file. bzip2 compresses it to a new file named $file.bz2.

  2. stat -c %s "$file.bz2"
    

    This runs stat on the newly created compressed file $file.bz2.

    From man stat:

    stat - display file or file system status
    
        -c  --format=FORMAT
             use the specified FORMAT instead of the default
        %s    total size, in bytes
    

    So this stat command prints the size, in bytes, of the new file.

  3. $(some_command)
    

    This is called command substitution. The Bash manual describes it as follows:

    Bash performs the expansion by executing command in a subshell environment and replacing the command substitution with the standard output of the command.

    So, var=$(some_command) saves the output of some_command into a variable var.
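The last two pieces can be tried in isolation. This is a small sketch assuming GNU stat (BSD and macOS stat take different options); the temporary file and the variable names size, x and inner are made up for the illustration.

```shell
tmpfile=$(mktemp)
printf 'hello' > "$tmpfile"            # exactly 5 bytes, no trailing newline

# stat prints the size; command substitution captures it in a variable.
size=$(stat -c %s "$tmpfile")
echo "size: $size"                     # size: 5

# The commands run in a subshell, so assignments made inside do not leak out.
x=outer
inner=$(x=inner ; echo "$x")
echo "inside: $inner, outside: $x"     # inside: inner, outside: outer

rm -f "$tmpfile"
```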


In total:

compress_size_bzip2=$(bzip2 "$file" ; stat -c %s "$file.bz2")

This runs bzip2 and stat in a subshell. The output of the subshell is the size of the compressed file in bytes, which is saved in the variable compress_size_bzip2.


However, there is room for improvement:

You should combine the commands in the subshell with && instead of ;, so that stat only runs if bzip2 succeeded.
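A sketch of why this matters, using a made-up scenario: the input file is missing, but a stale .bz2 from an earlier run is still lying around. It assumes GNU stat; the file names and the variables size_semicolon and size_and are hypothetical.

```shell
tmpdir=$(mktemp -d)
file="$tmpdir/missing.txt"             # deliberately never created
printf 'stale' > "$file.bz2"           # 5-byte leftover, not real bzip2 data

# With ';' stat runs even though bzip2 failed, so the stale size is reported.
size_semicolon=$(bzip2 "$file" 2>/dev/null ; stat -c %s "$file.bz2")
echo "with ';':  '$size_semicolon'"

# With '&&' stat is skipped, the variable stays empty, and the assignment
# itself returns a non-zero status that can be tested.
if size_and=$(bzip2 "$file" 2>/dev/null && stat -c %s "$file.bz2"); then
    echo "with '&&': '$size_and'"
else
    echo "with '&&': compression failed, no size recorded"
fi

rm -r "$tmpdir"
```

Testing the assignment with if, as above, is one way to react to the failure; checking $? right after the assignment would work as well.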

If you don't need the compressed file, you can tell bzip2 to write the compressed data to standard output with the -c flag, and use wc -c to count its bytes:

compress_size_bzip2=$(bzip2 -c "$file" | wc -c)
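A self-contained way to try this, assuming bzip2 is installed (the scratch file and the variable original_size are made up for the example). Unlike the first form, no .bz2 file is written to disk and the original file is left in place:

```shell
file=$(mktemp)
seq 1 1000 > "$file"                   # some sample content to compress

compress_size_bzip2=$(bzip2 -c "$file" | wc -c)
echo "compressed size in bytes: $compress_size_bzip2"

original_size=$(wc -c < "$file")       # works because the original still exists
echo "original size in bytes: $original_size"

rm -f "$file"
```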