Prevent duplicate script from running at the same time

I am using scrapy to fetch some resources, and I want to make it a cron job which can start every 30 minutes.

The cron job:

0,30 * * * * /home/us/jobs/

cd ~/spiders/goods
export PATH
pkill -f $(pgrep | grep -v $$)
sleep 2s
scrapy crawl good

As the script shows, I tried to kill the running script process and its child process (scrapy) as well.

However, when I run two instances of the script, the newer instance does not kill the older one.

How can I fix that?


I have more than one .sh scrapy script, each run at a different frequency configured in cron.

Update 2 - Test for Serg's answer:

All the cron jobs were stopped before I ran the test.

Then I opened three terminal windows, say w1, w2 and w3, and ran the commands in the following order:

Run `pgrep scrapy` in w3, which prints nothing (meaning no scrapy is running at the moment).

Run `./` in w1.

Run `pgrep scrapy` in w3, which prints one process id, say `1234` (meaning scrapy has been started by the script).

Run `./` in w2, then check w1 and find that the script there has been terminated.

Run `pgrep scrapy` in w3, which prints two process ids, `1234` and `5678`.

Press <kbd>Ctrl</kbd>+<kbd>C</kbd> in w2 (twice).

Run `pgrep scrapy` in w3, which prints one process id, `1234` (meaning the scrapy with id `5678` has been stopped).

At this point, I have to use `pkill scrapy` to stop the scrapy with id `1234`.

A better approach would be to use a wrapper script that calls the main script. It would look like this:

# This is /home/user/bin/ file
pkill -f ''
exec bash ./

Of course, the wrapper has to be named differently. That way, pkill can search only for your main script, and your main script reduces to this:

cd /home/user/spiders/goods
export PATH
scrapy crawl good

Note that in my example I am using ./ because the script was in my current working directory. Use the full path to your script for best results.

I have tested this approach with a simple main script that just runs an infinite while loop, plus the wrapper script. As the screenshot shows, launching a second instance of the wrapper kills the previous one.

[Screenshot: launching a second wrapper instance kills the first]

Your script

This is just an example. Remember that I have no access to scrapy to actually test this, so adjust it as needed for your situation.

Your cron entry should look like this:

0,30 * * * * /home/us/jobs/

Contents of

pkill -f ''
exec sh /home/us/jobs/

Contents of

cd /home/user/spiders/goods
export PATH
# sleep delay now is not necessary
# but uncomment if you think it is
# sleep 2
scrapy crawl good

If I understand correctly, you want to start a process every 30 minutes (via cron), but when cron starts a new process, you want to kill any existing instance that is still running?

You could use the "timeout" command to ensure that scrapy is forced to terminate if it is still running after 30 minutes.

This would make your script look like this:

cd ~/spiders/goods
export PATH
timeout 30m scrapy crawl good

Note the timeout added in the last line.

I have set the duration to "30m" (30 minutes). You might want to choose a slightly shorter time (say 29m) to ensure that the process has terminated before the next job starts.

Note that if you change the spawn interval in crontab, you will have to edit the script as well.
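
To make the last two points concrete, here is a minimal sketch, assuming GNU coreutils `timeout`. The variable names `INTERVAL_MIN` and `LIMIT_MIN` are illustrative and not part of the original script, and `sleep` stands in for scrapy so the snippet can be tried anywhere. GNU `timeout` exits with status 124 when it had to stop the command, which the script can check.

```shell
#!/bin/sh
# Hypothetical settings: keep the cron interval and the safety margin together,
# so a change to the crontab schedule needs only one edit here.
INTERVAL_MIN=30                    # assumed to match the crontab schedule
LIMIT_MIN=$((INTERVAL_MIN - 1))    # stop one minute before the next run

# The real job would be (commented out so the sketch runs without scrapy):
# timeout "${LIMIT_MIN}m" scrapy crawl good

# Stand-in demonstration: a 5-second sleep limited to 1 second.
timeout 1 sleep 5
status=$?

# GNU timeout returns 124 when the time limit was reached.
if [ "$status" -eq 124 ]; then
    echo "job was stopped by timeout"
fi
```

By default `timeout` sends SIGTERM, so a job that ignores that signal would still need separate handling.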

Maybe you should track whether the script is running by having the parent shell script create a pid file, and kill the previously running parent shell script by checking that file. Something like this:


#!/bin/bash
#Placeholder values (assumptions, adjust to your setup)
PIDFILE="/tmp/goods-spider.pid"
TIMEOUT=5

#Check if script pid file exists and kill process
if [ -f "$PIDFILE" ]
then
  PID=$(cat "$PIDFILE")
  #Check if process id is valid
  ps -p "$PID" >/dev/null 2>&1
  if [ "$?" -eq "0" ]
  then
    #If it is valid kill process id
    kill "$PID"
    #Wait for timeout
    sleep "$TIMEOUT"
    #Check if process is still running after timeout
    ps -p "$PID" >/dev/null 2>&1
    if [ "$?" -eq "0" ]
    then
      echo "ERROR: Process is still running"
      exit 1
    fi
  fi
fi

#Create PID file
echo $$ > "$PIDFILE"
if [ "$?" -ne "0" ]
then
  echo "ERROR: Could not create PID file"
  exit 1
fi

export PATH
cd ~/spiders/goods
scrapy crawl good
#Delete PID file
rm -f "$PIDFILE"
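
One thing to watch with the pid file approach: `kill "$PID"` signals only the parent shell, so the scrapy child can keep running, which matches what Update 2 in the question observed. A possible way around this is to have the script forward the signal to its child with `trap`. The sketch below is untested against scrapy. The helper name `run_job` is hypothetical, and a `sleep` worker stands in for `scrapy crawl good`.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): run a command as a
# background child and forward TERM/INT to it, so it cannot outlive us.
run_job() {
    "$@" &
    child=$!
    trap 'kill "$child" 2>/dev/null' TERM INT
    wait "$child"
    rc=$?
    # A trapped signal interrupts wait (status > 128); wait again so the
    # child is fully gone before this script exits.
    if [ "$rc" -gt 128 ]; then
        wait "$child" 2>/dev/null
    fi
    return "$rc"
}

# In the real script the last command would become (assumption):
#   run_job scrapy crawl good

# Demonstration with a stand-in worker that records its own PID:
pidnote=$(mktemp)
( run_job sh -c 'echo $$ > "$1"; exec sleep 30' sh "$pidnote" ) &
script_pid=$!
sleep 1                            # give the worker time to start
worker_pid=$(cat "$pidnote")
kill "$script_pid"                 # what the next run does via the pid file
wait "$script_pid" 2>/dev/null     # wait until the "script" has exited

if kill -0 "$worker_pid" 2>/dev/null; then
    worker_state="still running"
else
    worker_state="stopped"
fi
echo "worker is $worker_state"
rm -f "$pidnote"
```

Starting the worker with `&` and then calling `wait` matters here: a shell blocked on a foreground command runs its trap only after that command finishes, whereas `wait` is interrupted as soon as the signal arrives.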