cron and "command not found"

I'm setting up a cron job to run an executable bash script that contains a pypdfocr command. When I execute the script manually, everything works as expected. If I instead use cron with this schedule:

* 6 * * * cd /path/to/ && ./executable

I get this error:

pypdfocr: command not found

Given this, in the bash script I've tried to give the full path to pypdfocr, i.e.:

/anaconda/bin/pypdfocr

But now I have:

/bin/sh: pdfimages: command not found
/bin/sh: gs: command not found

Any idea on how I can fix this?


Solution 1:

When cron runs a job, it uses the default shell (usually /bin/sh) with a minimal environment for the owning user. No "profile" customization is applied: your .bash_profile is not sourced, so any PATH settings made there are not picked up, and I don't believe the system-wide profiles are read either. As a result, the process you launch typically sees only a bare-bones PATH (often just /usr/bin:/bin) and no LD_LIBRARY_PATH, which is why pdfimages and gs aren't being found.
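To see the effect without waiting for cron, the sparse environment can be approximated from an interactive shell. This is only a sketch: the PATH value below is an assumption about what your cron provides, and check_in_cron_env is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed cron default; check your system's crontab documentation for the real value.
cron_path="/usr/bin:/bin"

check_in_cron_env() {
    # env -i starts from an empty environment, so only PATH is set here.
    # Prints the resolved location of "$1", or a "not found" message.
    env -i PATH="$cron_path" /bin/sh -c \
        'command -v "$1" || echo "$1: not found on PATH=$PATH"' sh "$1"
}

check_in_cron_env gs
check_in_cron_env pdfimages
```

Any tool reported as not found here is likely to fail the same way under cron.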

In the past, I've solved this one of two ways:

  1. Directly reference the full path of the file I need.
  2. Create a wrapper shell script for the job.
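For either approach you need to know where the tools actually live. A minimal sketch, run from the interactive shell where the script already works (locate_tools is a hypothetical helper name):

```shell
#!/bin/sh
locate_tools() {
    # Print where each named command resolves in the current shell.
    for tool in "$@"; do
        if found=$(command -v "$tool"); then
            echo "$tool -> $found"
        else
            echo "$tool -> not found in this shell"
        fi
    done
}

locate_tools pypdfocr pdfimages gs
```

The directories it reports are the ones to hard-code or to add to PATH. Note that a full path to pypdfocr alone is not enough when pypdfocr itself calls pdfimages and gs by name.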

I typically prefer the second option: it lets me set up an environment for the job to run in, and it makes debugging easy. For example, if the job isn't working right, I can just edit the shell script and redirect STDOUT to a debug file.

So in short, I would have a cron entry of

* 6 * * * cd /path/to/ && ./executable.sh

This changes to the directory, while executable.sh does all the export PATH, export LD_LIBRARY_PATH, etc. needed to set the job up.

Your sample executable.sh could be as simple as this:

#!/bin/bash

# if you want to just pick up your profile, you can '.' source it
. ~/.bash_profile

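# prepend the extra directories rather than replacing the existing values
export PATH=/where/i/find/gs:$PATH
export LD_LIBRARY_PATH=/if/i/need/libs${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}

# send both STDOUT and STDERR to executable.out
(./executable 2>&1) >executable.out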

The redirection to executable.out isn't strictly necessary, since without it cron captures the output itself (typically mailing it to the job's owner), but it is a bit cleaner to do it this way. The 2>&1 inside the parentheses makes sure that both STDERR and STDOUT make it into the output file; this helps with debugging why a job didn't run.
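The redirection pattern can be tried in isolation. In this sketch, job is a hypothetical stand-in for the real executable and the log goes to a temporary file rather than executable.out:

```shell
#!/bin/sh
# Hypothetical stand-in for the real job: writes one line to each stream.
job() {
    echo "normal output"
    echo "error output" >&2
}

log=$(mktemp)

# Inside the parentheses STDERR is merged into STDOUT;
# the combined stream is then written to the log file.
(job 2>&1) >"$log"

cat "$log"
```

Both lines should appear in the log. The equivalent form without a subshell is `job >"$log" 2>&1`, where the order of the two redirections matters.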