Background Worker with Flask

I have a webapp that's built on Python/Flask, and it has a corresponding background job that runs continuously, periodically polling for data for each registered user.

I would like this background job to start when the system starts and keep running until it shuts down. Instead of setting up /etc/rc.d scripts, I just had the Flask app spawn a new process (using the multiprocessing module) when the app starts up.

So with this setup, I only have to deploy the Flask app and that will get the background worker running as well.

What are the downsides of this? Is this a complete and utter hack that is fragile in some way or a nice way to set up a webapp with corresponding background task?


The downside of your approach is that there are many ways it could fail, especially around stopping and restarting your Flask application.

  • You will have to deal with graceful shutdown to give your worker a chance to finish its current task.
  • Sometimes your worker won't stop in time and might linger while you start another one when you restart your Flask application.
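To illustrate the first point, here is a minimal sketch of a worker loop that finishes its current task before exiting. It assumes the worker is stopped with SIGTERM, which is what multiprocessing's Process.terminate() sends on Unix. The names GracefulWorker and task are hypothetical and not part of Flask or multiprocessing.

```python
import signal
import time


class GracefulWorker:
    """Polling loop that finishes its current task before exiting on SIGTERM."""

    def __init__(self, task, interval=1.0):
        self.task = task          # hypothetical callable doing one polling pass
        self.interval = interval  # seconds to wait between passes
        self.stopping = False

    def request_stop(self, signum=None, frame=None):
        # Signal handler: only set a flag, so the running task is not interrupted.
        self.stopping = True

    def run(self):
        # Must be called from the main thread of the worker process.
        signal.signal(signal.SIGTERM, self.request_stop)
        while not self.stopping:
            self.task()
            # Sleep in short slices so a stop request is noticed quickly.
            deadline = time.monotonic() + self.interval
            while not self.stopping and time.monotonic() < deadline:
                time.sleep(min(0.1, self.interval))
```

The parent still has to wait for the worker to exit (for example with Process.join() and a timeout) before starting a new one, otherwise the lingering-worker problem from the second point remains.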

Here are some approaches I would suggest, depending on your constraints:

script + crontab

You only have to write a script that does whatever task you want, and cron will take care of running it for you every few minutes. Advantages: cron runs it periodically and starts doing so again whenever the system starts. Disadvantages: if the task takes longer than the interval between runs, you might have multiple instances of your script running at the same time. You can find some solutions for this problem here.
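One common way to avoid overlapping runs is a lock file. The sketch below assumes a Unix system (it uses fcntl.flock) and a hypothetical lock path and helper name, run_exclusively. It is one possible approach, not necessarily the one the linked solutions describe.

```python
import fcntl
import os
import tempfile

# Hypothetical lock path; any location writable by the cron user works.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "poll_users.lock")


def run_exclusively(job, lock_path=LOCK_PATH):
    """Run job() unless another instance holds the lock. Returns True if it ran."""
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock; fails at once if already held.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # previous run still in progress, skip this one
        job()
        return True  # the lock is released when the file is closed
```

The script that cron invokes would wrap its polling function in run_exclusively(...) and simply exit when it returns False.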

supervisord

supervisord is a neat way to manage different daemons. You can set it to run your app, your background script, or both, and have them start with the server. The only downside is that you have to install supervisord and make sure its daemon is running when the server starts.

uwsgi

uwsgi is a very common way of deploying Flask applications. It has a few features that might come in handy for managing background workers, such as mules and a cron-like scheduling interface.

Celery

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well. I think this is the best solution for scheduling background tasks for a Flask application or any other Python-based application. But using it comes with some extra bulk. You will be introducing at least the following processes:

  • a broker (RabbitMQ or Redis)
  • a worker
  • a scheduler

You can also get supervisord to manage all of the processes above and get them to start when the server starts.

Conclusion

In your quest to reduce the number of processes, I would highly suggest the crontab-based solution, as it can get you a long way. But please make sure your background script leaves an execution trace or logs of some sort.
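As a rough sketch of what such an execution trace could look like, the standard logging module is enough. The logger name, the log path, and the helper names configure_logging and run_logged are assumptions made for illustration.

```python
import logging
import time


def configure_logging(log_path):
    # log_path is a hypothetical location chosen by you, e.g. under /var/log.
    logger = logging.getLogger("poll_users")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def run_logged(job, logger):
    """Run job() and record start, duration, and any failure."""
    started = time.monotonic()
    logger.info("run started")
    try:
        job()
    except Exception:
        # logger.exception records the traceback along with the message.
        logger.exception("run failed")
        raise
    logger.info("run finished in %.2fs", time.monotonic() - started)
```

With this in place, each cron run leaves a timestamped start line and either a finish line or a traceback, which makes silent failures much easier to spot.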