How many Context Switches is "normal" (as a function of CPU cores (or other))?

This depends very much on the type of application you run. If you've got applications which are very trigger-happy with regard to syscalls, you can expect to see high context switch rates. If most of your applications idle around and only wake up when there's something happening on a socket, you can expect to see low context switch rates.

System calls

System calls cause context switches by their very nature. When a process makes a system call, it basically tells the kernel to take over from its current point in time and memory, do something the process isn't privileged to do itself, and then return to the same spot when it's done.

When we look at the man page for the write(2) syscall on Linux, this becomes very clear:

NAME
       write - write to a file descriptor

SYNOPSIS
       #include <unistd.h>

       ssize_t write(int fd, const void *buf, size_t count);

DESCRIPTION
       write() writes up to count bytes from the buffer pointed to by buf to the
       file referred to by the file descriptor fd. [..]

RETURN VALUE
       On success, the  number of bytes written is returned (zero indicates
       nothing was written). On error, -1 is returned, and errno is set
       appropriately.
       [..]

This basically tells the kernel to take over operation from the process, move up to count bytes starting at the memory address pointed to by buf to the file descriptor fd of the current process, and then return to the process and tell it how it went.
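To make this concrete, here's a minimal C sketch (the file name and compile command are just examples, not part of the original answer). Every call to write() below crosses into the kernel and back; running the resulting binary under strace -c shows that even this trivial program makes a handful of syscalls beyond the one we wrote ourselves:

    /* write_demo.c - each write() is a trap into the kernel and back.
     * Compile with e.g.: cc -o write_demo write_demo.c
     */
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char *msg = "hello\n";
        /* fd 1 is stdout; the kernel copies strlen(msg) bytes from msg */
        ssize_t n = write(1, msg, strlen(msg));
        return (n == (ssize_t)strlen(msg)) ? 0 : 1;
    }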

A nice example to show this is the dedicated game server for Valve Source based games, hlds. http://nopaste.narf.at/f1b22dbc9 shows one second's worth of syscalls made by a single instance of a game server which had no players on it. This process takes about 3% CPU time on a Xeon X3220 (2.4 GHz), just to give you a feeling for how expensive this is.

Multi-Tasking

Another source of context switching is processes which don't do syscalls, but still need to be moved off a given CPU to make room for other processes.

A nice way to visualize this is cpuburn. cpuburn doesn't do any syscalls itself, it just iterates over its own memory, so it shouldn't cause any context switching.

Take an idle machine, start vmstat and then run a burnMMX (or any other test from the cpuburn package) for every CPU core the system has. You should have full system utilization by then, but hardly any increased context switching. Then try to start a few more processes. You'll see that the context switching rate increases as the processes begin to compete over CPU cores. The amount of switching depends on the processes/core ratio and the multitasking resolution of your kernel.
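If you don't have the cpuburn package handy, a crude stand-in is easy to write. This is only a sketch of the idea (the real cpuburn exercises the FPU/MMX units, this just touches its own memory), but it shares the relevant property of making no syscalls once it is running:

    /* spin.c - hypothetical cpuburn stand-in: pure userspace work,
     * no syscalls after startup. Start one copy per core and vmstat's
     * "cs" column should stay low; start more copies than cores and it
     * should climb as the scheduler begins preempting them.
     */
    #include <stddef.h>

    int main(void)
    {
        static volatile unsigned int buf[1024];
        size_t i = 0;
        for (;;) {                  /* never exits; kill it manually */
            buf[i % 1024] += 1;     /* touch our own memory only */
            i++;
        }
    }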

Further reading

linfo.org has a nice writeup on what context switches and system calls are. Wikipedia has generic information and a nice link collection on System calls.


My moderately loaded webserver sits at around 100-150 context switches a second most of the time, with peaks into the thousands.

High context switch rates are not themselves an issue, but they may point the way to a more significant problem.

edit: Context switches are a symptom, not a cause. What are you trying to run on the server? If you have a multiprocessor machine, you may want to try setting CPU affinity for your main server processes.
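For what it's worth, here's a sketch of a process pinning itself to one core, assuming Linux and glibc (the shell equivalent is taskset -c 0 <command>):

    /* affinity_demo.c - restrict the calling process to CPU 0.
     * Requires _GNU_SOURCE for sched_setaffinity() and the CPU_* macros.
     */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        CPU_SET(0, &mask);          /* allow only CPU 0 */
        if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        /* the real server work would run here, staying on CPU 0 */
        return 0;
    }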

Alternatively if you are running X, try dropping down into console mode.

edit again: at 16k context switches per second, each CPU is averaging two switches per millisecond - that is half to a sixth of a normal timeslice. Could he be running a lot of IO-bound threads?

edit again post graphs: Certainly looks IO bound. Is the system spending most of its time in SYS when the context switches are high?

edit once more: High iowait and system time in that last graph - completely eclipsing userspace. You have IO problems. What FC card are you using?

edit: Hmmm, any chance of getting some benchmarks running against your SAN with bonnie++ or dbench during dead time? I would be interested in seeing if they show similar results.

edit: Been thinking about this over the weekend and I've seen similar usage patterns when bonnie is doing the "write a byte at a time" pass. That may explain the large amount of switching going on, as each write would require a separate syscall.
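For illustration (the file name and byte count here are made up), a byte-at-a-time writer looks roughly like this - each loop iteration is a separate write(2), so the syscall and context switch counters climb in step with the bytes written:

    /* onebyte.c - write a file one byte per syscall, the pattern that
     * bonnie's "write a byte at a time" pass produces.
     */
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return 1;
        char byte = 'x';
        for (int i = 0; i < 100000; i++)
            write(fd, &byte, 1);    /* one syscall per byte written */
        close(fd);
        return 0;
    }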


I'm more inclined to worry about the CPU time spent in system state. If it's close to 10% or higher, it means your OS is spending too much time doing context switches. Even though moving some processes to another machine is much slower, it's worth doing.


Things like this are why you should try to keep performance baselines for your servers. That way, you can compare things you notice all of a sudden with things you have recorded in the past.

That said, I have servers running (not very busy Oracle servers, mainly) which are steady at around 2k context switches per second, with some 4k peaks. For my servers, that is normal; for other people's servers that might be way too low or too high.

How far can you go back in your data?

What kind of CPU information can you give us?