How can I stop an inlined background process when my script stops?
Solution 1:
A somewhat general approach.
while true; do foo; sleep 1; done &
# the rest of the script here
kill -- -"$$"
The trick is that the script runs its child processes (here foo, among others) with a Process Group ID (PGID) equal to the PID of the shell. This propagates to grandchildren and so on. The shell itself is in this process group as well. There are exceptions (jobs in interactive shells, timeout), so this is not as general as you may want; still, with foo being ssh or a similar simple command in a non-interactive script, the approach should work.
kill with a negative argument sends signals to the entire process group.
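If you want to check the shared process group on your own system, here is a minimal sketch. It assumes a non-interactive sh-compatible shell; pgid_of is a helper name made up for this illustration, sleep 30 stands in for a background child such as foo, and the /proc fallback is Linux-only.

```shell
# pgid_of is a hypothetical helper: print the process group ID of PID $1.
pgid_of() {
    pgid=$(ps -o pgid= -p "$1" 2>/dev/null | tr -d ' ') || pgid=
    if [ -z "$pgid" ] && [ -r "/proc/$1/stat" ]; then
        # Linux-only fallback: after "pid (comm) " come state, ppid, pgrp.
        pgid=$(sed 's/.*) //' "/proc/$1/stat" | cut -d ' ' -f 3)
    fi
    printf '%s\n' "$pgid"
}

sleep 30 &                      # stand-in for a background child such as foo
child=$!

shell_pgid=$(pgid_of "$$")
child_pgid=$(pgid_of "$child")
echo "shell PID $$ is in group $shell_pgid; child PID $child is in group $child_pgid"

kill "$child"                   # demo cleanup
```

Both group numbers should match. When the script is started from an interactive shell, they usually also equal the script's own PID, which is what lets kill -- -"$$" reach the right group.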
One caveat though: a possible race condition. In general, foo may get killed before the subshell receives and handles the signal. If the delay is long enough (for whatever reason), a new foo may be spawned (especially without sleep 1) after kill does its job. Consider this improvement:
while true; do foo; sleep 1; done &
subpid=$!
# the rest of the script here
kill "$subpid"
wait "$subpid" 2>/dev/null
# at this moment we're certain the subshell is no more, new foo will not be spawned
trap '' TERM
# foo will maintain the old PGID, so…
kill -- -"$$" 2>/dev/null
The trap is there only to make the main shell exit gracefully without printing Terminated to the console.
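To see why the group-wide kill is still needed after the subshell is gone, here is a rough sketch of the first half of that sequence. The foo below is a hypothetical stand-in that records its PID in a scratch file and then becomes a long sleep; pidfile and the *_state variables exist only for this demo.

```shell
pidfile=$(mktemp)   # hypothetical scratch file so the demo can find foo's PID

# Hypothetical stand-in for foo: record the PID, then turn into a long sleep.
foo() { sh -c 'echo "$$" > "$1"; exec sleep 30' sh "$pidfile"; }

while true; do foo; sleep 1; done &
subpid=$!
until [ -s "$pidfile" ]; do sleep 1; done   # wait for the first foo to start

kill "$subpid"
wait "$subpid" 2>/dev/null || true          # non-zero status is expected here

foopid=$(cat "$pidfile")
if kill -0 "$subpid" 2>/dev/null; then loop_state=running; else loop_state=gone; fi
if kill -0 "$foopid" 2>/dev/null; then foo_state=running; else foo_state=gone; fi
echo "loop subshell: $loop_state, foo: $foo_state"

kill "$foopid"      # demo cleanup only; the script above uses kill -- -"$$"
rm -f "$pidfile"
```

The loop subshell should be reported as gone while the orphaned foo is still running. No new foo can appear at that point, so the final kill -- -"$$" only has to deal with what is already there.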
Solution 2:
Not a general approach for any background process, yet usually a useful method for ssh in a similar scenario.
Use autossh. From its manual:
autossh is a program to start a copy of ssh and monitor it, restarting it as necessary should it die or stop passing traffic. […]
autossh tries to distinguish the manner of death of the ssh process it is monitoring and act appropriately. The rules are:
- If the ssh process exited normally (for example, someone typed exit in an interactive session), autossh exits rather than restarting;
- If autossh itself receives a SIGTERM, SIGINT, or a SIGKILL signal, it assumes that it was deliberately signalled, and exits after killing the child ssh process;
- […]
- […]
- If the child ssh process dies for any other reason, autossh will attempt to start a new one.
Therefore:
autossh … &
apid=$!
# the rest of the script here
kill "$apid"
Note you won't be notified if the tunnel cannot be established in the first place. Since this is a possible flaw in your original approach as well, I'm not addressing this problem here.
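If the script can end at several points, one option (not part of the snippet above, just a variation) is to issue the kill from an EXIT trap. A minimal sketch, with sleep 300 as a hypothetical stand-in for autossh … & and a scratch pidfile used only to inspect the result; the subshell plays the role of the script:

```shell
pidfile=$(mktemp)   # hypothetical scratch file, used only to inspect the demo

(
    # Hypothetical stand-in for: autossh … &
    sleep 300 &
    apid=$!
    echo "$apid" > "$pidfile"

    # Runs when this shell exits; wait reaps the child so it is fully gone.
    trap 'kill "$apid" 2>/dev/null; wait "$apid" 2>/dev/null' EXIT

    # the rest of the script here
    exit 0
)

apid=$(cat "$pidfile")
rm -f "$pidfile"
if kill -0 "$apid" 2>/dev/null; then bg_state=running; else bg_state=stopped; fi
echo "background process: $bg_state"
```

This covers normal exits. Whether an EXIT trap also runs when the shell dies from a signal differs between shells, so do not rely on it for that case without testing.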