Apache MPM workers stuck in G (gracefully finishing) growing - "scoreboard is full"

Solution 1:

When you use MPM worker, requests are handled by threads that exist in processes.

From https://httpd.apache.org/docs/2.4/mod/worker.html

A single control process (the parent) is responsible for launching child processes. Each child process creates a fixed number of server threads as specified in the ThreadsPerChild directive, as well as a listener thread which listens for connections and passes them to a server thread for processing when they arrive.

On Linux, a process 'contains' threads. That is, one PID can have multiple threads, which share memory (amongst other resources) with the other threads in that PID.

As a matter of fact, Linux really only cares about 'tasks'; a process that is not multi-threaded is simply a PID containing one task.
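To see the PID/task relationship for yourself, here is a minimal sketch that lists the tasks the kernel tracks for one process. It assumes a Linux system with /proc mounted; the helper name `list_task_ids` is made up for illustration.

```python
import os

def list_task_ids(pid):
    """Return the sorted task (thread) IDs that belong to one PID.

    Linux-only: every task of a process appears as a directory
    under /proc/<pid>/task.
    """
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))
```

Calling `list_task_ids(pid)` with the PID of one of your Apache child processes should return more than one ID when MPM worker is in use, and a single ID for a process that is not multi-threaded.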

When you gracefully reload Apache, you're asking each containing process to terminate. A process cannot exit while any of its threads is still working, so Apache makes every thread in the process wait until all of the threads in that process have completed before the container PID is replaced.

So, in your case, every process in that list contains a thread that is still busy or stuck somehow.

You've got a few options.

  1. Just give up waiting anyway and restart.
  2. Find the problem thread (might be a bug in the application) and fix it.

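Option 1 is easy. Add the configuration option GracefulShutdownTimeout with a value that is high but not stupid, say 900 seconds. The default is 0, which means wait indefinitely, so your threads wait forever for the problem thread to finish. Note that the Apache documentation describes this timeout in terms of the graceful-stop signal, so check that it has the effect you expect on a graceful reload.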
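As a rough sketch, the directive goes in the main server configuration. The 900 is just the example value from above, and which file holds your MPM settings depends on your distribution.

```apache
# Give up waiting for lingering connections after 15 minutes.
# A value of 0 (the default) means wait indefinitely.
GracefulShutdownTimeout 900
```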

The main downside is that you risk killing a process in the middle of doing something critical, which might in turn corrupt a file or break the application subtly. You also run a (vanishingly small) chance of cutting off a client half-way through a request.

Option 2 involves spotting the stuck thread in the list of workers and then diagnosing what its connection is doing. It is more work, but you're likely to find what could be a design flaw, and you can then account for the behaviour with more confidence instead of just blowing away a problem thread.
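To keep an eye on how many slots are stuck while you investigate, a small sketch like the following counts the worker states in the machine-readable mod_status output. It assumes mod_status is enabled and reachable; the URL `http://localhost/server-status?auto` and both function names are placeholders, not something taken from your setup.

```python
from collections import Counter
from urllib.request import urlopen

def count_scoreboard_states(status_text):
    """Count worker states from the 'Scoreboard:' line of ?auto output.

    Each character is one slot: 'G' is gracefully finishing,
    '_' is waiting for a connection, '.' is an open slot, and so on.
    """
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            return Counter(line.split(":", 1)[1].strip())
    raise ValueError("no Scoreboard line found in status output")

def fetch_status(url="http://localhost/server-status?auto"):
    """Fetch the status page; the default URL is an assumed location."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("ascii", "replace")
```

Running `count_scoreboard_states(fetch_status())["G"]` periodically shows whether the number of gracefully finishing slots is growing. The full server-status page lists the PID for each slot, which is where you would look next to pick out the stuck thread.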