How does cron internally schedule jobs?

Solution 1:

A few crickets heard in this question. Good 'ol RTFC with some discrete event simulation papers and Wikipedia:

http://en.wikipedia.org/wiki/Cron#Multi-user_capability

The algorithm used by this cron is as follows:

  1. On start-up, look for a file named .crontab in the home directories of all account holders.
  2. For each crontab file found, determine the next time in the future that each command is to be run.
  3. Place those commands on the Franta-Maly event list with their corresponding time and their "five field" time specifier.
  4. Enter main loop:
    1. Examine the task entry at the head of the queue, compute how far in the future it is to be run.
    2. Sleep for that period of time.
    3. On awakening and after verifying the correct time, execute the task at the head of the queue (in background) with the privileges of the user who created it.
    4. Determine the next time in the future to run this command and place it back on the event list at that time

Solution 2:

I wrote a blog post describing it.
Quoting the relevant text from there:

  • We can have a finite thread-pool which will execute all the tasks by picking them up from a PriorityBlockingQueue (thread-safe heap) prioritized on job.nextExecutionTime().
  • Meaning that the top element of this heap will be always be the one that will fire the soonest.
  • We will be following the standard threadpool producer-consumer pattern.
  • We will have one thread which will be running in an infinite loop and submitting new jobs to the thread pool after consuming them from the queue. Lets call it QueueConsumerThread:
void goToSleep(job, jobQueue){
    jobQueue.push(job);
    sleep(job.nextExecutionTime() - getCurrentTime());
}

void executeJob(job, jobQueue){
    threadpool.submit(job); // async call
    if (job.isRecurring()) {
        job = job.copy().setNextExecutionTime(getCurrentTime() + job.getRecurringInterval());
        jobQueue.add(job);
    }
}

@Override
void run(){
    while(true)
    {
        job = jobQueue.pop()
        if(job.nextExecutionTime() > getCurrentTime()){
            // Nothing to do
            goToSleep(job, jobQueue)
        }
        else{
            executeJob(job, jobQueue)
        }
    }
}
  • There will be one more thread which will be monitoring the crontab file for any new job additions and will push them to the queue.
  • Lets call it QueueProducerThread:
@Override
void run()
{
    while(true)
    {
        newJob = getNewJobFromCrontabFile() // blocking call
        jobQueue.push(newJob)
    }
}
  • However, there is a problem with this:
    • Imagine that Thread1 is sleeping and will wake up after an hour.
    • Meanwhile a new task arrives which is supposed to run every minute.
    • This new task will not be able to start executing until an hour later.
  • To solve this problem, we can have ProducerThread wakeup ConsumerThread from its sleep forcefully whenever the new task has to run sooner than the front task in the queue:
@Override
void run()
{
    while(true)
    {
        newJob = getNewJobFromCrontabFile() // blocking call
        jobQueue.push(newJob)
        if(newJob == jobQueue.peek())
        {
            // The new job is the one that will be scheduled next.
            // So wakeup consumer thread so that it does not oversleep.
            consumerThread.interrupt()
        }
    }
}

Note that this might not be how cron is implemented internally. However, this is the most optimal solution that I can think of. It requires no polling and all threads sleep until they need to do any work.