Why Does a Shell Script Trapping SIGTERM Work When Run Manually, But Not When Run via launchd?

The underlying problem is that Bash does not normally kill its non-builtin children, and it defers traps while one of them is running in the foreground. From the bash man page:

If bash is waiting for a command to complete and receives a signal for which a
trap has been set, the trap will not be executed until the command completes.
When bash is waiting for an asynchronous command via the wait builtin, the
reception of a signal for which a trap has been set will cause the wait builtin
to return immediately with an exit status greater than 128, immediately after
which the trap is executed.
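
To see the difference described in that passage, here is a rough sketch. The function name time_to_trap and its two mode names are made up for illustration, and the 3-second sleep just stands in for the real 86400. It starts a child shell with a TERM trap, signals it after one second, and reports how long the child took to exit.

```shell
#!/bin/bash
# time_to_trap MODE
#   Starts a child shell that traps TERM, signals it after 1 second, and
#   prints how many whole seconds the child took to exit.
#   MODE is "foreground" or "background" (names invented for this sketch).
time_to_trap() {
    mode=$1
    start=$(date +%s)
    (
        trap 'exit 0' TERM
        if [ "$mode" = foreground ]; then
            sleep 3          # shell is blocked: the trap waits for sleep to end
        else
            sleep 3 & wait   # wait is interrupted: the trap runs right away
        fi
    ) >/dev/null 2>&1 &
    pid=$!
    sleep 1
    kill -TERM "$pid"        # signal only the child shell, not its sleep
    wait "$pid"
    echo $(( $(date +%s) - start ))
}
```

Calling time_to_trap foreground should report roughly 3 seconds (the trap had to wait for the sleep), while time_to_trap background should report roughly 1.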

When you hit <CTRL>+<C> you're killing the shell script, which behaves normally -- but the sleep lives on. Use ps to see.

When you try to stop things externally, via kill, Bash behaves as described above: the trap is deferred until the sleep completes. After some time-out period (I'm guessing 20 seconds) launchd then issues a kill -9, which the script cannot trap.

The solution is to run the sleep in the background and issue a wait after it, which tells Bash that it can be interrupted:

sleep 86400 & wait

This will allow the script to be interrupted, but the sleep will still survive. I'm sure there's a way to kill the children, but I didn't bother looking it up...
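
For what it's worth, one way to kill the child as well is to remember its PID and kill it from the trap. This is only a sketch under my own assumptions: the names run_daemon, cleanup and child_pid are invented here, and the original script's lock file handling is left out.

```shell
#!/bin/bash
# PID of the current background sleep, if any (hypothetical variable name).
child_pid=""

# TERM handler: stop the sleeping child, reap it, then exit cleanly.
cleanup() {
    if [ -n "$child_pid" ]; then
        kill "$child_pid" 2>/dev/null
        # wait reports the status of the killed sleep; ignore it
        wait "$child_pid" 2>/dev/null || true
        echo "stopped sleep (pid $child_pid)"
    fi
    exit 0
}

# run_daemon SECONDS: sleep in the background so the trap can fire at once.
run_daemon() {
    trap cleanup TERM
    while true; do
        sleep "$1" &
        child_pid=$!
        wait "$child_pid"
    done
}
```

A script would end with something like run_daemon 86400. Because the shell sits in wait rather than in a foreground sleep, a SIGTERM is handled immediately and the sleep does not outlive the script.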

I realize you've shared what is essentially a code fragment, and it's not clear what your daemon is actually meant to achieve other than performing some action every so many seconds. So I'm going to make some assumptions based on what you've written.

It seems like you're using the lockfile to prevent duplicate launch.
It then seems that you need the trap to clean up the lock file used to implement your test to assure singularity.
Additionally, it appears that your daemon runs a sleep loop to wake periodically and perform some action. (Just sleep more, in your example.)

These are all issues that launchd is meant to resolve in better ways under Darwin (and hence OS X).

As for the specific question about unload and SIGTERM: when you unload, your launch daemon is sent a SIGKILL instead of a SIGTERM. If you just want to stop the job or send it a SIGTERM, use launchctl stop instead of launchctl unload.

If you want a SIGTERM sent on unload, you may need to set EnableTransactions. Likewise, if your daemon has cleanup tasks and should receive SIGTERM so that it can perform them, set EnableTransactions in the launchd plist for your script: <key>EnableTransactions</key><true/>. This is described in the docs at https://developer.apple.com/library/mac/documentation/Darwin/Reference/Manpages/man5/launchd.plist.5.html

But the three mechanisms above are unnecessary given launchd...

Under Darwin / OS X, the appropriate way to implement a sleep-loop daemon as a launch daemon is to use StartInterval to run on an interval, or StartCalendarInterval to run at specific times. StartCalendarInterval has the additional advantage that if the system is asleep at the scheduled time, the missed run is executed on wake instead of waiting for the next scheduled time, which is generally what you want in these situations. If you have a job that you just want kept running, also consider using KeepAlive in the plist.

So it looks like -- from the code sample you've provided -- you just want to execute something every 86400 seconds. If that is the case, launchd has a mechanism for this that you should be using instead. It removes the need for your lock file and trap altogether, as launchd is designed to handle all of this for you automagically. That mechanism is StartInterval: when defined, it launches your daemon every N seconds. launchd also makes sure it hasn't launched multiple copies of your daemon.

This mechanism is described in the launchd docs at https://developer.apple.com/library/mac/documentation/Darwin/Reference/Manpages/man5/launchd.plist.5.html where it states:

StartInterval <integer>
This optional key causes the job to be started every N seconds.  If the system is
asleep, the job will be started the next time the computer wakes up.  If multiple
intervals transpire before the computer is woken, those events will be coalesced 
into one event upon wake from sleep.

So your Darwin-ized script ~/Downloads/Example.sh would now look very simple, something like this:

#!/bin/sh
echo "$(date +%R) Running…" # or whatever it is you wanted to do on the interval

And your plist would look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.example</string>
    <key>ProgramArguments</key>
    <array>
        <string>sh</string>
        <string>~/Downloads/Example.sh</string>
    </array>
    <key>EnableGlobbing</key>
    <true/>
    <key>StartInterval</key>
    <integer>86400</integer>
    <key>StandardOutPath</key>
    <string>/mypathtolog/myjob.log</string>
    <key>StandardErrorPath</key>
    <string>/mypathtolog/myjob.log</string>
</dict>
</plist>

Note that I've also set the log files here, in the plist, in a Darwin/launchd-like manner rather than in the script itself. (You could of course remove those keys and handle logging in your script, but it's not necessary given launchd.)

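I'd note that you could also implement this using Program, which names the executable itself. ProgramArguments then still supplies the whole argument vector, starting with argv[0], so the shell name has to stay in the array:

<key>Program</key>
<string>/bin/sh</string>
<key>ProgramArguments</key>
<array>
    <string>sh</string>
    <string>~/Downloads/Example.sh</string>
</array>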
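
For the StartCalendarInterval alternative mentioned earlier, a job definition could be generated along these lines. This is a sketch only: the helper name write_daily_plist and the 03:00 start time are my own placeholders, and the label and script path are simply carried over from the example above.

```shell
#!/bin/sh
# write_daily_plist FILE
#   Writes a launchd job definition to FILE that runs the example script
#   once a day. The 03:00 schedule is a placeholder chosen for illustration.
write_daily_plist() {
    cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.example</string>
    <key>ProgramArguments</key>
    <array>
        <string>sh</string>
        <string>~/Downloads/Example.sh</string>
    </array>
    <key>EnableGlobbing</key>
    <true/>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>3</integer>
        <key>Minute</key>
        <integer>0</integer>
    </dict>
</dict>
</plist>
EOF
}
```

You would call it with the path of the plist to create and then load that file with launchctl load as usual. The Hour and Minute keys replace the StartInterval integer; everything else matches the plist shown earlier.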

You may also find http://launchd.info a useful reference, along with the Apple docs on how launchd operates at https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/Introduction.html

Information about daemons run periodically can be found at https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/ScheduledJobs.html#//apple_ref/doc/uid/10000172i-CH1-SW2
