Use cases for having different process priority for CPU and IO?

By default, changing the niceness of an application also adjusts its I/O priority.
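As a rough illustration of that coupling: ionice(1) documents that a process which has never set an I/O priority is treated as best-effort with io_priority = (cpu_nice + 20) / 5. The sketch below only reproduces that arithmetic. The function name default_io_prio is made up for this example, and the mapping only matters with I/O schedulers that honour priorities (such as CFQ or BFQ).

```shell
# default_io_prio NICE
# Prints the best-effort I/O priority (0 = highest, 7 = lowest) that a process
# with the given CPU niceness gets when it has not set one explicitly.
# Formula as given in ionice(1): io_priority = (cpu_nice + 20) / 5
default_io_prio() {
    echo $(( ($1 + 20) / 5 ))
}

default_io_prio 0     # prints 4
default_io_prio 5     # prints 5
default_io_prio 19    # prints 7
```

Giving the two priorities unrelated values therefore means setting the I/O priority explicitly, for example with ionice.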

Everything of course depends on your workload, but one of the key aspects of any operating system is how it allocates its resources and how it handles contention.

It's important to understand what niceness does, because under load from competing processes the way the operating system behaves can have an impact on the rest of your workload.

Contention is the measure of how different applications compete for the same resource (like CPU).

Handling Load

Ever since the Completely Fair Scheduler (CFS) was introduced, nice is merely a frontend to the 'weight' value of each process, which can be viewed in /proc.

$ cat /proc/self/sched
---------------------------------------------------------
...
se.load.weight                     :                 1024
...

Changing niceness merely alters the weight:

$ nice -n 5 cat /proc/self/sched 
---------------------------------------------------------
...
se.load.weight                     :                 335
...

CPU contention is resolved by the Completely Fair Scheduler. Every process is assigned a 'weight' value, and when processes contend for CPU time the scheduler totals the weights of all contending processes and gives each one a share of CPU time proportional to its own weight.

If I have 3 applications all wanting to use CPU time, by default they each receive the normal weight of 1024. If one of them runs with a nice of +5 like above, the three weights total 2383 (1024 + 1024 + 335), so the niced process receives about 14% of CPU time (335 / 2383) in a given second if all 3 processes are asking for the CPU in that second.
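The share arithmetic can be checked with a few lines of shell. This is only a sketch of the proportional split, assuming all processes are runnable on a single CPU and are not placed in separate scheduling groups. The function name cpu_share is made up for illustration.

```shell
# cpu_share MY_WEIGHT OTHER_WEIGHT...
# Prints the percentage of CPU time the first weight receives when every
# listed weight is contending for the same CPU.
cpu_share() {
    LC_ALL=C awk 'BEGIN {
        total = 0
        # ARGV[1] is our own weight; it is part of the total as well
        for (i = 1; i < ARGC; i++) total += ARGV[i]
        printf "%.1f\n", 100 * ARGV[1] / total
    }' "$@"
}

cpu_share 335 1024 1024    # the nice +5 process
cpu_share 1024 1024 335    # one of the nice 0 processes
```

With the weights from the example, the niced process comes out at roughly 14% and each of the other two at roughly 43%.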

Why is there a need to have different CPU and IO priority?

Niceness is really only concerned with what to do when the system is under load, that is, how the operating system slices time up between competing processes.

How this affects you depends on the delivery priorities your applications have relative to one another, and on how long each application should take to deliver.

Niceness really only does something when your system is under load (there is more work wanting attention than the CPU or disk can handle). It merely instructs the kernel how to allocate resources under those circumstances.

Is there any real world usage for having them different?

If you have numerous competing processes, or more work than the CPU can get through, niceness gives you some relatively stable guarantees as to what work finishes first. This may be important if you are, say, producing a report that should be delivered before another report finishes.

On a desktop system niceness can be even more important. Certain applications have real-time behaviour, whereby being woken up more often during load prevents their data going stale. PulseAudio falls into this category, for example.

Other applications might be required to dish out work for dependent applications. For example, lots of Apache requests to a SQL server like MySQL may block for a long period of time because SQL isn't serving fast enough, say because some other report is competing for CPU time. So not only is SQL stalled, but so is Apache. SQL can suffer here because it usually has far fewer worker threads than Apache, whose threads as a group are weighted more favourably by the scheduler, so giving more CPU time to SQL evens things up.

updatedb (a program that indexes files) runs late at night and is very disk heavy. It may be useful to reduce its IO scheduling priority so that other applications running at that time get priority over something that is not as important in the order of things.

What real world use cases have you found that need different CPU and IO priority?

Very few. Niceness is too much of a best-effort approach. As a rule of thumb, I care less about how well applications perform and more about how badly they could perform. This might sound backwards at first, but I have service delivery guarantees to meet, which are more important to me.

I want the confidence to say "your stuff even on a bad day will be done in X time period". If it goes faster, it's just a bonus.

I will typically start out by generating agreed specifications such as:

  • All web application requests are guaranteed to finish in 0.3 seconds.
  • All SQL requests on a system are guaranteed to be completed in 0.1 seconds.
  • The web application should handle no more than 50 IOPS and deliver 1k files.
  • The web application's memory footprint is no higher than 250MB in total.

And draw out requirements to meet like:

  • All web requests should be completed in 0.05 seconds.
  • All SQL requests should be completed in 0.02 seconds.
  • There should be sufficient memory to handle all requests.
  • IO requirements should be met.

Provided the specifications hold, I then meet these goals without virtualization, using the vastly more efficient approach of control groups.

Control groups let me make pretty reliable service level guarantees for resource allocation, provided the application behaves within the boundaries specified. This means that even on a system under load I can guarantee resource availability for the application in question and guarantee space for other applications on the same box!

If we take your example of CPU and IO, I set up limits that meet those requirements:

# cd /sys/fs/cgroup/blkio/apache
# echo "253:0 100" >blkio.throttle.read_iops_device 
# echo "253:0 50" >blkio.throttle.write_iops_device 
# echo "253:0 102400" >blkio.throttle.read_bps_device

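So reads on device 253:0 are capped at 100 IOPS and 102400 bytes (100k) per second, which works out to 1k per read, and writes are capped at 50 IOPS.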

# cd /sys/fs/cgroup/cpu/apache
# echo 1000000 >cpu.cfs_period_us
# echo 60000 >cpu.cfs_quota_us 

Of a 1 second time-period, give 0.06 seconds of CPU.

# cd /sys/fs/cgroup/cpu/sql
# echo 1000000 >cpu.cfs_period_us
# echo 20000 >cpu.cfs_quota_us

Of a 1 second time-period, give 0.02 seconds of CPU.
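The fraction of a CPU each group may use is simply quota divided by period. A small sketch of that arithmetic follows, using the values written to cpu.cfs_quota_us and cpu.cfs_period_us above. The function name cfs_cpu_fraction is made up for illustration.

```shell
# cfs_cpu_fraction QUOTA_US PERIOD_US
# Prints the fraction of one CPU a cgroup may consume per period.
cfs_cpu_fraction() {
    LC_ALL=C awk -v quota="$1" -v period="$2" \
        'BEGIN { printf "%.2f\n", quota / period }'
}

cfs_cpu_fraction 60000 1000000    # apache group
cfs_cpu_fraction 20000 1000000    # sql group
```

Unlike a niceness weight, this is a hard ceiling: once a group has used its quota it is throttled until the next period begins, even if the CPU is otherwise idle.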

Provided other competing cgroups don't do anything silly, being under load is less of a factor in my service delivery because I know how the CPU is being shared out for each application.

Control groups of this nature are still best effort, but they offer far more control over that effort than niceness and ioniceness do.