Using nice scheduling priority with tar/gzip cron script

Solution 1:

There are caveats to pay attention to. Since the question doesn't specify an exact OS (but implies some Unix-like OS), the list of caveats depends on the specific OS and version. The most important ones to keep in mind are:

nice is intended to affect how much CPU time is given to a process, not how much RAM or I/O capacity it gets. So instead of the intended effect, other possible outcomes include:

  • The backup takes longer to complete because it is given less CPU time. But it still uses just as much RAM as before, and now it holds that RAM for a longer time. The system is slowed down by having less RAM available for other purposes, and that slowness lasts longer than it used to.
  • The use of nice has no effect at all, because the backup process was I/O bound to begin with, and I/O scheduling is unaffected by nice. If the OS happens to be a recent Linux version, I/O scheduling may or may not be affected by nice, depending on which ionice setting is in use (see the sketch just after this list).
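
For example, on Linux the two can be combined so that both CPU and I/O priority are lowered. A minimal sketch, assuming a Linux system with the ionice utility (from util-linux); the paths are placeholders:

# Run the whole backup pipeline at the lowest CPU priority (nice 19) and in the
# best-effort I/O class at its lowest level (ionice -c2 -n7). Wrapping the pipeline
# in sh -c lets both tar and gzip inherit the reduced priorities.
nice -n 19 ionice -c2 -n7 sh -c 'tar cf - /path/to/stuff | gzip > /backups/archive.tar.gz'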

Moreover, even the exact effect on CPU scheduling depends a lot on the specific operating system and its settings. Some kernels have settings that allow a process to run at a higher or lower priority than anything reachable with the nice command.

One caveat that I have run into myself appears to be specific to Ubuntu 14.04. In the default configuration it groups processes for scheduling purposes (autogrouping), and each group then receives a fair share of CPU time. nice only affects how CPU time is allocated to processes within such a group, not how much is allocated to each group. For me that completely undermined the use of nice, because a low-priority process could still take CPU time away from processes in other groups.
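
If you run into the same behaviour, you can check whether autogrouping is enabled, and temporarily turn it off as root, via the sched_autogroup_enabled sysctl. A sketch, assuming a Linux kernel built with autogroup support:

# 1 means autogrouping is enabled, 0 means it is disabled
cat /proc/sys/kernel/sched_autogroup_enabled

# Temporarily disable it (as root) so nice values apply across the whole system again
echo 0 > /proc/sys/kernel/sched_autogroup_enabled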

Solution 2:

I'd take a different approach...

No, I wouldn't mess around with nice for this. And gzip isn't that great to begin with. Plus, you're using gzip -9, which gives the highest compression ratio at the expense of CPU time. Do you really need that level of compression over the default (level 6)?

Does your system get strained as much if you don't use gzip level 9?
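
If you want to measure that on your own data, a quick sketch (bigfile.txt stands in for one of your real files):

# Time both compression levels on the same file and compare the resulting sizes.
# gzip -c writes to stdout, so the original file is left untouched.
time gzip -6 -c bigfile.txt > bigfile-6.gz
time gzip -9 -c bigfile.txt > bigfile-9.gz
ls -lh bigfile-6.gz bigfile-9.gz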

What are the specifications of your server? How many and what type of CPUs do you have? cat /proc/cpuinfo

If you have multiple CPUs, would you consider using pigz instead? It's multithreaded, a bit more efficient and can leverage the resources on your system much better.
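
A sketch of what that might look like, assuming pigz is installed from your distribution's packages; the paths are placeholders:

# Same tar pipeline, but compressed across all available cores.
# Add e.g. -p 4 if you want to cap the number of threads and leave some CPUs free.
tar cf - /path/to/stuff | pigz -6 > archive.tar.gz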


Some tests with a 1.8GB file:

Standard gzip (-6 compression level)

Original file size: 1.8G    CHL0001.TXT 
Compression time: 0m18.335s
Compressed file size: 85M   CHL0001.TXT.gz
Decompression time: 0m6.300s

gzip -9 (highest compression)

Original file size: 1.8G    CHL0001.TXT
Compression time: 1m29.432s
Compressed file size: 75M   CHL0001.TXT.gz
Decompression time: 0m6.325s

pigz (-6 compression level)

Original file size: 1.8G    CHL0001.TXT
Compression time: 0m1.878s
Compressed file size: 85M   CHL0001.TXT.gz
Decompression time: 0m2.506s

pigz -9 (highest compression, multithreaded)

Original file size: 1.8G    CHL0001.TXT
Compression time: 0m5.611s
Compressed file size: 76M   CHL0001.TXT.gz
Decompression time: 0m2.489s

Conclusion: Is the extra bit of compression worth the vastly longer time spent compressing the data?

Solution 3:

I realize that this is straying from the original question, but it is staying on the theme of efficiency (you mention "huge strains on my server")...

I'm inferring (or guessing!) from what you've posted that you are creating a tar archive containing a set of files and then gzipping the result. You could save yourself a lot of disk I/O (and temporary disk space) by piping one directly into the other:

tar cf - /path/to/stuff | gzip > archive.tar.gz

You may find that makes a significant difference to the total elapsed time.
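
If you're using GNU tar, the same pipeline can also be written as a single step, and you can hand compression off to pigz instead of gzip; both forms below are sketches with placeholder paths:

# One-step equivalent using tar's built-in gzip support
tar czf archive.tar.gz /path/to/stuff

# Or let GNU tar invoke pigz as the compressor
tar --use-compress-program=pigz -cf archive.tar.gz /path/to/stuff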