How does Linux determine the next PID?
The kernel allocates PIDs sequentially, below an upper bound of pid_max (which defaults to PID_MAX_DEFAULT). It does so separately in each PID namespace, so tasks in different namespaces can have the same ID. When the range is exhausted, PID assignment wraps around and restarts at RESERVED_PIDS.
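If you want to see the upper bound in effect on a running system, Linux exposes it in /proc/sys/kernel/pid_max. A minimal sketch for reading it follows; the helper names read_long_from_file and current_pid_max are made up for this example, and it assumes /proc is mounted.

```c
#include <stdio.h>

/* Read a single integer from a file; returns -1 if it cannot be read. */
long read_long_from_file(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Upper bound for PID allocation on the running system, or -1 if unavailable.
 * Assumes a Linux system with /proc mounted. */
long current_pid_max(void)
{
    return read_long_from_file("/proc/sys/kernel/pid_max");
}
```

Calling current_pid_max() and printing the result with printf("%ld\n", ...) is enough to check the value on your own machine.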
Some relevant code:
Inside alloc_pid(...)
for (i = ns->level; i >= 0; i--) {
    nr = alloc_pidmap(tmp);
    if (nr < 0)
        goto out_free;
    pid->numbers[i].nr = nr;
    pid->numbers[i].ns = tmp;
    tmp = tmp->parent;
}
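To make the effect of that loop concrete, here is a rough user-space model of it, not kernel code. It shows one task receiving a separate number in its own namespace and in every ancestor namespace. All toy_* names and MAX_LEVELS are invented for this sketch, and the allocator is reduced to "last + 1".

```c
#include <stddef.h>

#define MAX_LEVELS 8  /* hypothetical cap for this sketch */

/* Toy namespace: tracks only its depth, its parent and the last number handed out. */
struct toy_ns {
    int level;
    int last_pid;
    struct toy_ns *parent;
};

/* One (number, namespace) pair per level, like pid->numbers[i] in the loop above. */
struct toy_upid {
    int nr;
    struct toy_ns *ns;
};

struct toy_pid {
    int level;
    struct toy_upid numbers[MAX_LEVELS];
};

/* Trivial stand-in for alloc_pidmap(): hand out last_pid + 1, no wraparound. */
static int toy_alloc_nr(struct toy_ns *ns)
{
    return ++ns->last_pid;
}

/* Walk from the task's namespace up to the root, taking one number in each.
 * Returns 0 on success, -1 if the nesting is too deep for this sketch. */
int toy_alloc_pid(struct toy_ns *ns, struct toy_pid *pid)
{
    struct toy_ns *tmp = ns;
    int i;

    if (ns->level >= MAX_LEVELS)
        return -1;
    pid->level = ns->level;
    for (i = ns->level; i >= 0; i--) {
        pid->numbers[i].nr = toy_alloc_nr(tmp);
        pid->numbers[i].ns = tmp;
        tmp = tmp->parent;
    }
    return 0;
}
```

With a root namespace whose last_pid is 4999 and a fresh child namespace, the first call yields number 1 inside the child and 5000 in the root, which is why the same task shows different PIDs depending on where you look from.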
alloc_pidmap()
static int alloc_pidmap(struct pid_namespace *pid_ns)
{
    int i, offset, max_scan, pid, last = pid_ns->last_pid;
    struct pidmap *map;

    pid = last + 1;
    if (pid >= pid_max)
        pid = RESERVED_PIDS;
    /* and later on... */
    pid_ns->last_pid = pid;
    return pid;
}
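The elided middle part is where the kernel searches for a number that is actually free. As an illustration only, the overall behaviour (last + 1, wrap to the reserved floor, skip numbers still in use) can be sketched in user space like this. The toy_* names and the tiny limits are assumptions chosen so the wraparound is easy to observe; the real kernel scans a bitmap page by page instead of a bool array.

```c
#include <stdbool.h>

#define TOY_PID_MAX       16  /* hypothetical, tiny so wraparound is easy to see */
#define TOY_RESERVED_PIDS  4  /* hypothetical; lower numbers are skipped after a wrap */

struct toy_pidmap {
    bool in_use[TOY_PID_MAX];
    int last_pid;             /* must start out >= 0 */
};

/* Return the next free number after last_pid, wrapping to TOY_RESERVED_PIDS
 * at the top of the range. Returns -1 if nothing is free after a full pass. */
int toy_alloc_pidmap(struct toy_pidmap *map)
{
    int pid = map->last_pid + 1;
    int scanned;

    for (scanned = 0; scanned < TOY_PID_MAX; scanned++, pid++) {
        if (pid >= TOY_PID_MAX)
            pid = TOY_RESERVED_PIDS;
        if (!map->in_use[pid]) {
            map->in_use[pid] = true;
            map->last_pid = pid;
            return pid;
        }
    }
    return -1;
}

/* Mark a number as free again; last_pid is deliberately left untouched. */
void toy_free_pidmap(struct toy_pidmap *map, int pid)
{
    map->in_use[pid] = false;
}
```

Note that freeing a number does not move last_pid back, so a PID that was just released is not handed out again until the counter has gone all the way around.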
Do note that PIDs in the context of the kernel are more than just int identifiers; the relevant structure can be found in /include/linux/pid.h. Besides the ID, it contains a list of tasks with that ID, a reference counter and a hashed list node for fast access.
PIDs do not always appear sequential in user space because the scheduler may run some other process that forks in between your process's fork() calls. This is very common, in fact.
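If you want to observe this yourself, a small helper along these lines forks a few short-lived children and records the PIDs the kernel handed out. This is only a sketch; spawn_and_collect is a name made up for this example.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork n short-lived children one after another and store their PIDs in out[].
 * Returns the number of children actually created. */
int spawn_and_collect(pid_t *out, int n)
{
    int created = 0;

    for (int i = 0; i < n; i++) {
        pid_t child = fork();

        if (child < 0)
            break;               /* fork failed; stop early */
        if (child == 0)
            _exit(0);            /* child: exit immediately */
        out[created++] = child;  /* parent: remember the PID */
        waitpid(child, NULL, 0); /* reap the child before forking the next one */
    }
    return created;
}
```

Print the collected values from the caller and look at the gaps between consecutive entries: on a quiet machine they tend to be 1, while on a busy one other processes' forks show up as larger jumps.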
I would rather assume the behavior you are seeing stems from another source:
Good web servers usually run several process instances to balance the load of requests. These processes are managed in a pool, and one of them is assigned to each request as it comes in. To optimize performance, Apache probably assigns the same process to a series of sequential requests from the same client. After a certain number of requests, that process is terminated and a new one is created.
I don't believe that Linux assigns the same PID to several processes in a row.
Since you say the new PID is going to be close to the last one, I guess Linux simply assigns each process the last PID + 1. But processes are started and terminated in the background all the time by applications and system programs, so you cannot predict the exact PID of the next Apache process to be started.
Apart from this, you should not base anything you implement on assumptions about PID assignment. (See also sanmai's comment.)
PIDs are sequential on most systems. You can see that by starting several processes yourself on an idle machine.
For example, use up-arrow history recall to repeatedly run a command that prints its own PID:
$ ls -l /proc/self
lrwxrwxrwx 1 root root 0 Mar 15 19:32 /proc/self -> 21491
$ ls -l /proc/self
lrwxrwxrwx 1 root root 0 Mar 15 19:32 /proc/self -> 21492
$ ls -l /proc/self
lrwxrwxrwx 1 root root 0 Mar 15 19:32 /proc/self -> 21493
$ ls -l /proc/self
lrwxrwxrwx 1 root root 0 Mar 15 19:32 /proc/self -> 21494
Do not depend on this: for security reasons, some people run kernels that spend extra CPU time to randomly choose new PIDs.