We're currently setting up a server to do some heavy lifting (ETL) after another process within the business has finished. At the moment we're firing off jobs either via scheduled cron jobs or remote execution (via ssh). Earlier this week we hit an issue with too many jobs running side by side on the system, which slowed all of them to a snail's pace as they fought for CPU time.

I've been looking for a batch scheduler: a system where we can insert jobs into a run queue and have them processed one by one. Can anyone recommend a program/system for this? Low cost / FOSS would be appreciated due to the shoestring nature of this project.


I'd set up some kind of queueing service. A quick Google on "ready to use" stuff shows this:

  • http://sqs.sourceforge.net/

Depending on your needs, you could simply:

  • create a wrapper through which users submit jobs
  • have the wrapper write each job to a socket/file/whatever
  • create a consumer that runs the jobs one by one, waiting for each to finish
  • have cron call the consumer regularly (every 5 minutes or so)
    • of course, add some locking mechanism so that only n jobs run at a time (where n >= 1)
  • if there are no more jobs, the consumer does nothing
  • if there are more jobs, it grabs the next one and waits for it to finish
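The steps above could be sketched roughly like this in Python. This is a minimal sketch under assumptions, not a finished tool: the spool directory, the one-file-per-job layout, and the names submit and consume are all made up for illustration, and the lock is a plain flock, so it only covers the n = 1 case on a Unix-like system.

```python
import fcntl
import os
import subprocess
import time
from pathlib import Path

# Hypothetical spool directory; pick whatever suits your system.
QUEUE_DIR = Path("/var/spool/etl-queue")


def submit(command: str, queue_dir: Path = QUEUE_DIR) -> Path:
    """Wrapper side: store one shell command as a job file."""
    queue_dir.mkdir(parents=True, exist_ok=True)
    # A timestamp prefix makes file names sort in submission order.
    job = queue_dir / f"{time.time_ns()}-{os.getpid()}.job"
    tmp = job.with_suffix(".tmp")
    tmp.write_text(command)
    tmp.rename(job)  # rename so the consumer never sees a half-written job
    return job


def consume(queue_dir: Path = QUEUE_DIR) -> int:
    """Consumer side: run queued jobs one by one, return how many ran."""
    queue_dir.mkdir(parents=True, exist_ok=True)
    with open(queue_dir / ".consumer.lock", "w") as lock:
        try:
            # Non-blocking exclusive lock: only one consumer runs at a time.
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return 0  # a previous cron invocation is still busy
        ran = 0
        while True:
            jobs = sorted(queue_dir.glob("*.job"))
            if not jobs:
                return ran  # no more jobs: do nothing
            job = jobs[0]
            subprocess.run(job.read_text(), shell=True)  # wait for it to finish
            job.unlink()
            ran += 1
```

Cron would then call consume() every few minutes. If the previous run is still working through the queue, the new invocation fails to get the lock and exits straight away.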

Actually, there's more to it: you might have requirements that call for a priority queue, which brings up problems like starving jobs or similar. Still, it's not that hard to get something up and running quite fast.
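To illustrate the starvation point, one common way around it is "aging", where a job's effective priority improves the longer it waits. The sketch below is only an assumption about how you might do it: the AgingQueue class, the aging_rate parameter, and the convention that lower numbers are more urgent are all invented for this example. Since every job ages at the same rate, ordering by priority + aging_rate * submitted_at gives the same result as recomputing the aged priorities on every pop.

```python
import heapq
import itertools


class AgingQueue:
    """Priority queue where waiting jobs gradually gain urgency.

    Lower priority numbers are more urgent. aging_rate is the number of
    priority points a job gains per second of waiting (hypothetical unit).
    """

    def __init__(self, aging_rate: float = 1.0):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: FIFO among equal keys
        self._aging_rate = aging_rate

    def put(self, job, priority: float, submitted_at: float) -> None:
        # The effective priority at time `now` is
        #   priority - aging_rate * (now - submitted_at).
        # The `now` term is the same for every job, so this static key
        # yields the same ordering.
        key = priority + self._aging_rate * submitted_at
        heapq.heappush(self._heap, (key, next(self._counter), job))

    def get(self):
        """Remove and return the job that is currently most urgent."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

With aging_rate=0.5, a priority-10 job submitted at t=0 is picked ahead of a priority-1 job submitted at t=30, so a steady stream of urgent jobs cannot block it forever.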

If LDP, as suggested by womble, works for you, I'd take that. Having such a system maintained by a larger community is of course better than creating your own bugs for problems others have already solved :)

Also, the queuing service has the advantage of decoupling the resources from the actual number crunching. By making the jobs available over some network connection, you can simply throw hardware at a (possible) scaling problem and scale out almost without limit.


Two solutions spring to mind:

  1. Use xargs -P to control the maximum parallel processes at one time.
  2. Create a Makefile and spawn with make -j.

They are actually both summarised in this SO thread in more detail.

These may not be applicable, depending on how your scripts are structured.
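If neither fits and your jobs are launched from a script anyway, the same idea (a hard cap on how many processes run at once) can be approximated in Python. To be clear, this is a swapped-in technique using a thread pool, not what xargs or make do internally, and run_bounded and max_parallel are names made up for this sketch.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_bounded(commands, max_parallel=2):
    """Run shell commands with at most max_parallel running at once.

    Returns the exit codes in the same order as the commands.
    """
    def run(cmd):
        # Each worker thread blocks until its command has finished.
        return subprocess.run(cmd, shell=True).returncode

    # The pool size is the cap: extra commands wait for a free worker.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run, commands))
```

Calling run_bounded(list_of_commands, max_parallel=4) would keep at most four jobs competing for CPU time, with the rest waiting their turn.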


A heavyweight solution to your problem is to use something like Sun Grid Engine.

Sun Grid Engine (SGE) is distributed resource management software that allows the resources within a cluster or machine (CPU time, software, licenses, etc.) to be utilized effectively.

Here is a small tutorial on how to use SGE.