Can I have multiple threads running in parallel in a pod with a CPU limit set to 500 millicores?

Solution 1:

Yes. CPU limits in Kubernetes are implemented using the Linux CFS bandwidth controller, i.e. the CPU quota subsystem (at least on Linux, not sure on Windows). That system works by giving the cgroup a budget of CPU time it is allowed to use, and refilling that budget at the start of every period (100 ms by default). If a task (thread, process) is marked as runnable and the group it is in still has quota available, then it will run just as it always would, and the time it uses is deducted from the budget. If there is no quota left, then it won't run: the group is throttled until the next refill, and the kernel counts this in the cgroup's cpu.stat file (nr_throttled).
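If you want to see whether a container is actually running out of quota, the kernel exposes per-cgroup counters in cpu.stat. The sketch below is only an illustration: throttle_summary is a hypothetical helper name, and the file path in the usage note assumes cgroup v2 as seen from inside the container (on cgroup v1 the file usually lives under /sys/fs/cgroup/cpu/ instead).

```shell
# Hypothetical helper: summarise CFS throttling counters.
# Reads cpu.stat-formatted text ("name value" per line) on stdin.
throttle_summary() {
  awk '
    $1 == "nr_periods"   { periods = $2 }    # enforcement periods elapsed
    $1 == "nr_throttled" { throttled = $2 }  # periods in which the group was throttled
    END { printf "periods=%d throttled=%d\n", periods, throttled }
  '
}
```

Typical use would be something like throttle_summary < /sys/fs/cgroup/cpu.stat. A throttled count that keeps climbing means the threads are regularly exhausting the quota before the period ends.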

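Setting a limit of 500m means the refill rate will average out to 0.5 seconds of runtime allowed for every 1 second of wall clock time (with the default 100 ms period, that is 50 ms of CPU time per period). The budget is shared by every thread in the cgroup, not split per core. So if you had a million cores, you could use up the whole budget in one jiffy if your tasks were all runnable, and then sit throttled until the next refill.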
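The arithmetic can be sketched in a few lines of shell. This is a rough model, not what the kernel literally computes: quota_us and burn_time_us are made-up helper names, the 100000 microsecond period is the usual default but is assumed here, and the model assumes every thread runs flat out on its own core.

```shell
# Convert a CPU limit in millicores to a CFS quota in microseconds.
# The second argument is the period; 100000 us (100 ms) is assumed as the default.
quota_us() {
  millicores=$1
  period_us=${2:-100000}
  echo $(( millicores * period_us / 1000 ))
}

# Wall-clock microseconds until the quota is exhausted when the given
# number of threads run flat out in parallel, each on its own core.
burn_time_us() {
  quota=$1
  threads=$2
  echo $(( quota / threads ))
}

quota_us 500            # 500m limit, default period
burn_time_us 50000 2    # two busy threads sharing that quota
burn_time_us 50000 4    # four busy threads sharing that quota
```

Under these assumptions a 500m limit gives 50000 us of CPU time per period. Two busy threads use it up in 25000 us of wall-clock time and four in 12500 us, after which they all wait for the next period. The more threads run in parallel, the sooner the group is throttled.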

Solution 2:

You certainly can run 2 threads in a container at once, each using 25% of the CPU time. However, as for whether they will run at exactly the same time, I'm not 100% sure.

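The following test, which pins the container to two cores with half a CPU of quota and shows the per-CPU usage from top, does seem to indicate that they can: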

docker run -it  --cpus=".5" --cpuset-cpus="0,1" polinux/stress stress --cpu 2
%Cpu0  : 29.6 us,  0.0 sy,  0.0 ni, 70.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 29.6 us,  0.0 sy,  0.0 ni, 70.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st