Celery: WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL)

The SIGKILL your worker received was initiated by another process. Your supervisord config looks fine, and killasgroup would only affect a supervisor-initiated kill (e.g. via supervisorctl or a plugin); without that setting it would have sent the signal to the dispatcher anyway, not the child.

Most likely you have a memory leak and the OS's OOM killer is assassinating your process for bad behavior.

Run grep oom /var/log/messages. If you see matching messages, that's your problem.

If you don't find anything, try running the periodic process manually in a shell:

MyPeriodicTask().run()

And see what happens. I'd monitor system and process metrics with top in another terminal, if you don't have good instrumentation like Cacti, Ganglia, etc. for this host.
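
If it helps, here is a minimal sketch of doing that manual run while also printing the process's peak memory use from Python itself. MyPeriodicTask comes from the answer above; its import path is a stand-in, and ru_maxrss is reported in kilobytes on Linux:

import resource

from myapp.tasks import MyPeriodicTask  # hypothetical import path, adjust to your project


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes (Linux units)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


print("peak RSS before: %.1f MB" % peak_rss_mb())
MyPeriodicTask().run()
print("peak RSS after:  %.1f MB" % peak_rss_mb())

If the "after" number climbs toward your host's available RAM, the task itself is hoarding memory and the OOM killer explanation fits.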


This kind of error is raised when your asynchronous task (run through Celery), or the script you are using, holds a lot of data in memory. That causes the process to run out of memory.

In my case, I was fetching data from another system and accumulating it in a variable, so that I could export all of the data (into a Django model / an Excel file) after finishing the whole process.

Here is the catch: my script was gathering 10 million records, and accumulating all of them in a Python variable drained memory, which raised the error.

To overcome the issue, I divided the 10 million records into 20 parts (half a million in each part). Whenever the length of the collected data reached half a million, I stored it in my preferred destination (a local file / a Django model), cleared the in-memory data, and then did the same for the next half million, and so on.

There is no need to use that exact number of partitions. The idea is to solve a complex problem by splitting it into multiple subproblems and solving them one by one. :D A rough sketch of the approach follows.
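
As an illustration only, assuming a hypothetical fetch_records() generator for the external system and a Django model named MyRecord as the destination (both names are placeholders, not from my actual project):

CHUNK_SIZE = 500_000  # half a million records per flush

buffer = []
for record in fetch_records():               # stream records from the other system
    buffer.append(MyRecord(value=record))
    if len(buffer) >= CHUNK_SIZE:
        MyRecord.objects.bulk_create(buffer)  # persist this chunk to the database
        buffer = []                           # drop the references so memory is freed

if buffer:                                    # flush whatever is left at the end
    MyRecord.objects.bulk_create(buffer)

The key point is that the buffer is emptied after every flush, so the worker's memory use stays bounded by the chunk size instead of growing with the full 10 million records.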