How to schedule server jobs more intelligently than with cron?

Solution 1:

The problem isn't really with cron - it's with your job.

You will need to have your job interact with a lock of some description. The easiest way to do this is to have it attempt to create a directory: if that succeeds, continue; if not, exit. mkdir is atomic, so only one process can ever win the race to create it. When your job finishes it should remove the directory, ready for the next run. Here's a script to illustrate.

#!/bin/bash

# Remove the lock directory on exit, ready for the next run.
function cleanup {
    echo "Cleanup"
    rmdir /tmp/myjob.lck
}

# mkdir fails if the directory already exists, so only one instance
# can take the lock. The trap is only set once the lock is ours, so
# a failed run never removes a lock belonging to another process.
mkdir /tmp/myjob.lck || exit 1
trap cleanup EXIT
echo 'Job Running'
sleep 60
exit 0

Run this in one terminal, then, before the 60 seconds are up, run it in another terminal; the second copy will exit with status 1. Once the first process exits you can run it from the second terminal ...

EDIT:

As I just learned about flock, I thought I'd update this answer: flock(1) may be easier to use. In this case flock -n would seem appropriate, e.g.

* * * * * /usr/bin/flock -n /tmp/myAppLock.lck /path/to/your/job   

This would attempt to run your job every minute, but flock -n exits immediately, without running the job, if it cannot obtain a lock on the file (i.e. a previous run is still holding it).
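
If you'd rather keep the locking inside the script than in the crontab, flock(1) can also lock an open file descriptor. A minimal sketch of that pattern (the lock path and job body are just placeholders):

#!/bin/bash

(
    # Take an exclusive, non-blocking lock on file descriptor 9;
    # exit immediately if another instance already holds it.
    flock -n 9 || exit 1

    echo 'Job Running'
    sleep 60

# fd 9 stays open on the lock file for the whole subshell; the lock
# is released automatically when the subshell exits.
) 9>/tmp/myAppLock.lck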

Solution 2:

One way would be to have your reindex script create a lock file so that it can check whether there's already an instance of the script running. You could also add some exception handling to check that the search engine is up and running before starting, as sketched below.
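
A rough sketch of both ideas in shell (the lock path, health-check URL, and reindex command here are hypothetical; point them at whatever your search engine actually exposes):

#!/bin/bash

LOCKDIR=/tmp/reindex.lck

# Refuse to start if a previous reindex is still running.
mkdir "$LOCKDIR" 2>/dev/null || exit 1
trap 'rmdir "$LOCKDIR"' EXIT

# Skip this run if the search engine isn't answering.
if ! curl -sf http://localhost:8983/solr/admin/ping >/dev/null; then
    echo 'Search engine not responding, skipping reindex' >&2
    exit 1
fi

/path/to/reindex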

A more involved alternative would be to use a task-queueing system like Resque and resque-scheduler:

https://github.com/blog/542-introducing-resque

https://github.com/bvandenbos/resque-scheduler#readme

There's also Qu and Sidekiq:

https://github.com/bkeepers/qu

https://github.com/mperham/sidekiq

Yes, those are all Ruby-oriented, but you can look for "things like resque" in the language of your choice.