Running background jobs in a clustered environment
I have an architecture question. In a clustered web app environment, I can think of three ways to deal with background jobs:
- have a dedicated machine run all the jobs, thus freeing the web servers from having to do so
- have each web server also run background jobs, using a mechanism to make sure no two machines kick off the same job
- have one of the web servers double up as jobs-runner
What's the preferred approach?
IANAExpert, but I would imagine that option 1 is preferable. The reasoning is simple separation of concerns. If jobs have their own dedicated machine, you can manage growth better. With option 2, your job-processing capacity is tied to the number of web servers rather than to what the jobs actually require. While the resources used should be about the same whether one machine or many run the jobs, I imagine whatever queuing system you're using adds some overhead. Also, if something goes wrong with the queue or the web server, a failure in one won't bring the other down. You've siloed each part of your application, so you can grow as necessary, not as your architecture demands.
Each option has pros and cons, and choosing a preferred one requires (IMHO) a bit more information. For example, what sort of background jobs are they? This is a crucial question because if, for example, they are business processes, it might be worth taking advantage of the cluster that is already there.
If they are, for example, maintenance processes not directly related to the business (or to users' needs), it may make more sense to run them on separate hardware (physical or virtual).
In my experience, we are all sometimes a bit reluctant to make full use of the cluster, but clusters are there to be used!