I'm trying to reduce the latency of my Linux network application. I've learned that there are two tools for "binding" a program to a particular CPU core: taskset and cpuset.

  1. Which one should I prefer? Are they equivalent at a lower level?
  2. (The setup) My application has a single thread and is supposed to process a single TCP connection (no reconnects) over a fast LAN with the least possible latency. Am I on the right track?

Solution 1:

Taskset binds a process to one or more CPUs: it specifies where the process may run, either when it is launched or while it is already running. On RHEL/CentOS with modern server hardware, numactl is recommended over taskset, since it can also control NUMA memory placement.
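As a minimal sketch of the two ways taskset is used (launching a program pinned, and re-pinning a process that is already running): the helper names run_on_core and repin_pid and the DRY_RUN switch are made up for illustration, and only the taskset -c and taskset -cp invocations are the real tool. Core 5 and PID 1234 are placeholders.

```shell
# Hypothetical helpers; only the taskset invocations themselves are the real tool.
# With DRY_RUN set, the command line is printed instead of executed.

# Launch a program restricted to one core (or a list such as 2,4-6).
run_on_core() {
    core=$1; shift
    ${DRY_RUN:+echo} taskset -c "$core" "$@"
}

# Change the allowed cores of a process that is already running.
repin_pid() {
    ${DRY_RUN:+echo} taskset -cp "$1" "$2"
}

# Example: show what would be run without touching any process.
DRY_RUN=1 run_on_core 5 ./your_program
DRY_RUN=1 repin_pid 5 1234
```

Leave DRY_RUN unset to actually apply the affinity. Under the hood taskset calls sched_setaffinity(2), so the pinning applies to one process and is inherited by its children.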

Cpuset/cset is for CPU shielding and is a framework built around Linux cgroups. Cset was never popular on certain distributions (like RHEL) because there are other tools available for process management.

The first command below creates a shield that confines the operating system's tasks to CPU cores 0 and 8. The second moves your current shell session into the shield, so that system and user processes are isolated from each other.

# cset shield --cpu 1-7,9-15 --kthread=on
# cset proc --move --pid=$$ --threads --toset=user
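To double-check which cores a given --cpu list leaves for the system, here's a rough sketch. unshielded_cpus is a made-up helper and is not part of cset, and the total of 16 CPUs is an assumption inferred from the 0-15 numbering in the command above.

```shell
# Hypothetical helper (not part of cset): print the CPUs that a cset-style
# CPU list such as "1-7,9-15" leaves outside the shield.
#   $1 = shielded CPU list, $2 = total number of CPUs (assumed, e.g. 16)
unshielded_cpus() (
    list=$1 total=$2 out= cpu=0
    IFS=,                       # split the list on commas (subshell-local)
    while [ "$cpu" -lt "$total" ]; do
        shielded=0
        for range in $list; do
            lo=${range%-*}      # "1-7" -> 1, "3" -> 3
            hi=${range#*-}      # "1-7" -> 7, "3" -> 3
            if [ "$cpu" -ge "$lo" ] && [ "$cpu" -le "$hi" ]; then
                shielded=1
            fi
        done
        if [ "$shielded" -eq 0 ]; then
            out="${out:+$out,}$cpu"
        fi
        cpu=$((cpu + 1))
    done
    echo "$out"
)

unshielded_cpus 1-7,9-15 16     # prints 0,8
```

The function body runs in a subshell, so the IFS change and the working variables do not leak into the calling shell.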

There are other things to check and tune before you go down the path of binding processes to CPUs: interrupts (partially disabling irqbalance), power-saving settings, the system scheduler, I/O elevators, and realtime policy (chrt).
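For the interrupt part, one option besides irqbalance is to set a per-IRQ affinity mask by hand through /proc/irq/<N>/smp_affinity. Here's a minimal sketch of building a mask that allows every core except the dedicated one. irq_mask_excluding is a made-up helper, the 16-CPU count and core 5 are assumptions, and the plain hex format shown is only meant for machines with up to 32 CPUs.

```shell
# Hypothetical helper: hex mask with one bit per CPU, all set except the
# dedicated core. Assumes at most 32 CPUs.
#   $1 = core to keep free of interrupts, $2 = total number of CPUs
irq_mask_excluding() {
    printf '%x\n' $(( ((1 << $2) - 1) & ~(1 << $1) ))
}

irq_mask_excluding 5 16     # prints ffdf

# Applying it needs root and a real IRQ number from /proc/interrupts
# (left as a placeholder here):
#   echo "$(irq_mask_excluding 5 16)" > /proc/irq/<N>/smp_affinity
```

Keep in mind that a running irqbalance may rewrite masks set this way unless the same core is also banned there.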

See: Low latency TCP settings on Ubuntu

Here's a (convoluted) example of an application wrapper that selects a core, stops irqbalance, runs it once more in one-shot mode with the selected core blacklisted, then executes ./your_program with SCHED_FIFO and priority 99 on the selected core.

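# Core to dedicate to the program
Core=5
# Hex bitmask with only bit $Core set (2^Core); Core=5 gives 20
CoreMask=$(echo "16 o 2 $Core ^ p" | dc)
# Stop irqbalance and wait until it reports stopped (the exact status text depends on the init system)
service irqbalance stop
until [ "$(service irqbalance status)" = "irqbalance is stopped" ] ; do sleep 1 ; done
# Rebalance IRQs once with the selected core banned, then wait for irqbalance to exit
IRQBALANCE_ONESHOT=1 IRQBALANCE_BANNED_CPUS=${CoreMask} irqbalance
sleep 1
until [ "$(service irqbalance status)" = "irqbalance is stopped" ] ; do sleep 1 ; done
# Run on the selected core with local memory allocation, SCHED_FIFO, priority 99
numactl --physcpubind=${Core} --localalloc chrt -f 99 ./your_program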