Getting notified of failed cron jobs
I'm concerned that cron jobs can fail silently for an indefinite period of time on vanilla Ubuntu Desktop 12.04.1 (Precise), and no one will notice. I would like to get a notification whenever a system cron job prints some output or just fails.
I know it is possible to install a mail server (e.g. postfix), configure it for local-only delivery, set up an alias so that root's mail is delivered to my normal user account and configure a mail client to check my local mailbox.
Are there any lightweight alternatives to this solution in Ubuntu?
Solution 1:
You could redirect the error output of your cron job command to a file. Here is an example of a line in /etc/crontab:
01 3 * * * user /bin/command 2>> /var/log/some.file
Then at least you will have a clue if errors occurred. You could even write a script that notifies you via notify-osd or a similar tool when the file changes.
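A minimal sketch of such a script, assuming the error log from the crontab line above and that notify-send (from libnotify-bin) is available. STATE_FILE and NOTIFY_CMD are hypothetical names introduced here for illustration; the script simply compares the log's size with the size recorded on the previous run.

```shell
#!/bin/sh
# Sketch: notify when a cron error log has grown since the last check.
# ERR_LOG matches the redirect target in the crontab example above.
# STATE_FILE and NOTIFY_CMD are hypothetical names for this illustration.
ERR_LOG="${ERR_LOG:-/var/log/some.file}"
STATE_FILE="${STATE_FILE:-$HOME/.cron-errlog.size}"
NOTIFY_CMD="${NOTIFY_CMD:-notify-send}"

check_cron_errors() {
    # Current size in bytes; 0 if the log does not exist yet.
    if [ -f "$ERR_LOG" ]; then
        size=$(wc -c < "$ERR_LOG" | tr -d ' ')
    else
        size=0
    fi

    # Size seen on the previous run; 0 on the first run.
    last=0
    if [ -f "$STATE_FILE" ]; then
        last=$(cat "$STATE_FILE")
    fi

    # Notify only when new bytes were appended since the last check.
    if [ "$size" -gt "$last" ]; then
        "$NOTIFY_CMD" "Cron error" "$ERR_LOG grew by $((size - last)) bytes"
    fi

    # Remember the current size for the next run.
    echo "$size" > "$STATE_FILE"
}
```

To use it, add a final line that calls check_cron_errors and run the script every few minutes from your own user crontab. When started by cron, notify-send will probably need DISPLAY and DBUS_SESSION_BUS_ADDRESS set to reach your desktop session.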
Edit:
The file /var/log/syslog reports messages from cron as well, so you might want to take a look at that. To get a dedicated log file for the cron daemon, edit /etc/rsyslog.d/50-default.conf and uncomment/edit the line that says:
#cron.* /var/log/cron.log
I don't know what you'll find there, but it's worth a try. Report how it went.
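If you would rather not edit the file by hand, the uncommenting step could be scripted roughly like this. It is only a sketch: enable_cron_log is a hypothetical helper name, and it assumes the line starts with exactly "#cron.*" as quoted above.

```shell
#!/bin/sh
# Sketch: uncomment the "#cron.*" rule in an rsyslog config file.
# enable_cron_log is a hypothetical helper; pass the config path as $1.
enable_cron_log() {
    conf="$1"
    # Remove the leading '#' from the cron rule, keeping a .bak copy
    # of the original file next to it.
    sed -i.bak 's|^#cron\.\*|cron.*|' "$conf"
}
```

Run it as root with /etc/rsyslog.d/50-default.conf as the argument, then restart rsyslog (for example with "sudo service rsyslog restart") so the new rule takes effect.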
Solution 2:
"I would like to get a notification whenever a system cron job prints some output or just fails."
I'd recommend using some sort of cron monitoring tool. There are a few out there, but I currently use Dead Man's Snitch (https://deadmanssnitch.com) and like it. It will alert you when a cron job doesn't check in. To use it, have the job curl your unique snitch URL once it finishes. There are a few others out there, like probyapp, but they aren't free. Good luck.
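As a rough sketch of the check-in pattern: the wrapper below runs a job and only contacts the monitoring service when the job succeeds, so a failed or missing run shows up as a missed check-in. SNITCH_URL is a placeholder, not a real endpoint, and ping_snitch and run_and_check_in are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: run a job and check in with a monitoring service only on success.
# SNITCH_URL is a placeholder; substitute the URL your service gives you.
SNITCH_URL="${SNITCH_URL:-https://example.invalid/your-snitch-id}"

ping_snitch() {
    # -f: fail on HTTP errors, -sS: quiet but still print errors,
    # -m 10: give up after 10 seconds.
    curl -fsS -m 10 "$SNITCH_URL" > /dev/null
}

run_and_check_in() {
    # Run the job given as arguments. On success, check in; on failure,
    # skip the check-in so the service notices the missing ping.
    if "$@"; then
        ping_snitch
    else
        status=$?
        echo "job failed with exit status $status: $*" >&2
        return "$status"
    fi
}
```

For a single command, the same idea fits directly into a crontab line as "/bin/command && curl -fsS" followed by your snitch URL.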
Solution 3:
Jenkins is a fairly helpful tool for this sort of thing. I know people tend to think of it as a CI server, but it's really just a job execution tool. It captures the output, run time and exit status of a job, and uses the exit status to determine whether or not the job failed.
You can set up ssh-agent so that Jenkins connects to a remote machine, schedule jobs to run just like cron jobs, automatically space them based on run time (instead of "3am"), chain dependencies, rotate output logs on a per-job basis, and integrate with a bunch of outside systems (Slack, HipChat, etc.) via plugins to be notified of failures.
Last time I set this up, it was a huge help. We knew immediately if there were issues anywhere and were able to centrally control and track cron jobs from multiple different systems.