Bash pipeline signal propagation - how does it work?
While answering this question, I was unable to fully explain how signals propagate through a pipeline.
Consider the following examples.
Using timeout as the first element in the pipeline
This causes gpg to bail out having caught the SIGTERM that was delivered to cat, by timeout, leaving a broken file.
$ timeout 1 cat /dev/urandom | gpg -er [email protected] > ./myfile.gpg
gpg: Terminated caught ... exiting
Terminated
$ gpg -d < ./myfile.gpg > /dev/null
You need a passphrase to unlock the secret key for
user: "Attie Grande <[email protected]>"
4096-bit RSA key, ID C9AEA6AE, created 2016-12-13 (main key ID 7826F053)
gpg: encrypted with 4096-bit RSA key, ID C9AEA6AE, created 2016-12-13
"Attie Grande <[email protected]>"
gpg: block_filter 0x145e790: read error (size=14775,a->size=14775)
gpg: block_filter 0x145f110: read error (size=10710,a->size=10710)
gpg: WARNING: encrypted message has been manipulated!
gpg: block_filter: pending bytes!
gpg: block_filter: pending bytes!
Using timeout in the middle of the pipeline
This works as expected - gpg exits cleanly.
$ cat /dev/urandom | timeout 1 cat | gpg -er [email protected] > ./myfile.gpg
$ gpg -qd < ./myfile.gpg > /dev/null
You need a passphrase to unlock the secret key for
user: "Attie Grande <[email protected]>"
4096-bit RSA key, ID C9AEA6AE, created 2016-12-13 (main key ID 7826F053)
Using SIGUSR1 instead of SIGTERM
Again, this works as expected - gpg exits cleanly. I expect this is because cat quits on SIGUSR1, while gpg ignores it.
$ timeout -sUSR1 1 cat /dev/urandom | gpg -er [email protected] > ./myfile.gpg
$ gpg -qd < ./myfile.gpg > /dev/null
You need a passphrase to unlock the secret key for
user: "Attie Grande <[email protected]>"
4096-bit RSA key, ID C9AEA6AE, created 2016-12-13 (main key ID 7826F053)
Using process substitution
Again, this works - though I hadn't expected it to.
$ gpg -er [email protected] > ./myfile.gpg < <( timeout 1 cat /dev/urandom )
$ gpg -qd < ./myfile.gpg > /dev/null
You need a passphrase to unlock the secret key for
user: "Attie Grande <[email protected]>"
4096-bit RSA key, ID C9AEA6AE, created 2016-12-13 (main key ID 7826F053)
I can only presume that the signal of the first element in the pipeline is propagated through to the rest of the elements in the pipeline (even separating them with timeout cat | cat | gpg fails).
I've looked for documentation and played with set -e and set -o pipefail, but they didn't act as I was expecting.
- What is actually going on?
- What are the semantics?
- Do we have any control over this?
- Is there a better way than moving the signal-generating-process from the front of the pipeline?
Solution 1:
I can only presume that the signal of the first element in the pipeline is propagated through to the rest of the elements in the pipeline.
As far as I know there's no such propagation. I'm going to answer mainly your first question:
What is actually going on?
Short answer
(This may be somewhat simplified.)
- When running a pipe, interactive bash places every process in a process group with PGID (process group ID) equal to the PID (process ID) of the first command.
- timeout changes its own PGID to its own PID. This changes nothing if timeout is the first command in the pipe.
- timeout sends the signal not only to the underlying command but to its entire process group as well. If timeout is the first command in the pipeline then its process group will still include gpg, therefore gpg will get the signal.
The phenomenon is researched and elaborated below.
Elaboration
1. bash behavior
When running a pipe, interactive bash places every process in a process group with PGID equal to the PID of the first command. You can make your own tests (see Is it possible to get process group ID from /proc?). I haven't researched more complex possibilities (e.g. what if the first "command" is a subshell?); in your case they don't matter. What matters is that gpg in these commands
timeout 1 cat /dev/urandom | gpg -er [email protected] > ./myfile.gpg
cat /dev/urandom | timeout 1 cat | gpg -er [email protected] > ./myfile.gpg
timeout -sUSR1 1 cat /dev/urandom | gpg -er [email protected] > ./myfile.gpg
gpg -er [email protected] > ./myfile.gpg < <( timeout 1 cat /dev/urandom )
gets PGID equal to the PID of
- timeout
- (the first) cat
- timeout
- gpg (i.e. itself)
respectively.
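The grouping can be checked without an interactive session. This is a minimal sketch, assuming bash and a procps-style ps are installed; set -m switches on job control so that the non-interactive shell groups the pipeline the way an interactive one does, and sleep merely stands in for the real commands.

```shell
# Run in a bash child with job control (set -m), imitating an interactive shell.
result=$(bash -c '
  set -m
  sleep 5 | sleep 5 | sleep 5 &
  first=$(jobs -p)   # PID of the process group leader, i.e. the first command
  last=$!            # PID of the last command in the pipeline
  echo "$first $(ps -o pgid= -p "$first") $(ps -o pgid= -p "$last")"
  kill -- "-$first"  # clean up: signal the whole process group
')
set -- $result
first_pid=$1 first_pgid=$2 last_pgid=$3
echo "first PID: $first_pid, its PGID: $first_pgid, PGID of last command: $last_pgid"
```

If job control is in effect, all three numbers should be the same. Without set -m the pipeline would normally stay in the process group of the shell that started it.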
2. timeout changes its own PGID (or not)
Run strace timeout 1 cat and you will see among other things:
setpgid(0, 0)
An excerpt from man 2 setpgid:
int setpgid(pid_t pid, pid_t pgid);
setpgid() sets the PGID of the process specified by pid to pgid. If pid is zero, then the process ID of the calling process is used. If pgid is zero, then the PGID of the process specified by pid is made the same as its process ID.
This means timeout sets its PGID equal to its PID. There are two possibilities:
- if timeout is the first command, its PGID is the same before and after setpgid, so gpg still has the same PGID as timeout;
- if timeout is not the first command, its PGID is changed and even if gpg had initially the same PGID as timeout the two PGIDs are different now.
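The effect of setpgid(0, 0) can also be seen from the child's side, without strace: the command started by timeout inherits a PGID equal to the PID of timeout, which is the child's parent. This is a small sketch, assuming GNU coreutils timeout and a procps-style ps.

```shell
# The child reports its parent PID (that is timeout) and its own PGID.
# Because timeout has called setpgid(0, 0), the two numbers should match.
out=$(timeout 5 sh -c 'ps -o ppid= -o pgid= -p $$')
set -- $out
timeout_pid=$1 child_pgid=$2
echo "timeout PID: $timeout_pid, PGID inherited by the child: $child_pgid"
```

Here timeout is started from a plain script, so it is not the leader of any group beforehand. This is the "PGID is changed" case.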
3. timeout sends more signals than you expected
The same strace timeout 1 cat reveals lines like:
kill(19401, SIGTERM)
…
kill(0, SIGTERM)
In this example 19401 is the PID of cat. If you used -s USR1 then there will be SIGUSR1 instead of SIGTERM etc. This second kill is responsible for what you thought was a signal propagation through the pipeline. See man 2 kill (excerpt):
int kill(pid_t pid, int sig);
If pid equals 0, then sig is sent to every process in the process group of the calling process.
The calling process is timeout. It sends signals to its entire process group. I admit I don't know what the purpose behind this is, but it does so.
So if timeout is the first command in the pipeline then the chosen signal will be sent to every part of it (well, almost; consider another timeout in the same pipeline). This includes gpg. Then it's up to gpg how it reacts to the signal.
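The difference between the two placements shows up in the exit statuses of the pipeline members. This is a rough sketch, assuming bash and GNU coreutils timeout; sleep stands in for cat and gpg, and set -m again imitates an interactive shell.

```shell
# Assumption: bash and GNU coreutils timeout; sleep stands in for cat and gpg.
front=$(bash -c '
  set -m   # job control on, as in an interactive shell
  timeout 1 sleep 2 | sleep 2
  echo "${PIPESTATUS[*]}"
')
middle=$(bash -c '
  set -m
  sleep 2 | timeout 1 sleep 2 | sleep 2
  echo "${PIPESTATUS[*]}"
')
echo "timeout first:     $front"
echo "timeout in middle: $middle"
```

A status of 124 means timeout itself timed out; a status of 143 (128 + 15) means the process was killed by SIGTERM. With timeout in front, the last sleep should be killed as well. With timeout in the middle, the neighbouring commands should run to completion.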
Other questions
Do we have any control over this? Is there a better way than moving the signal-generating-process from the front of the pipeline?
My quick search yielded no common tool to set/change PGID. I think you can write your own program that will call setpgid(2) or so; but now that we know what is going on, moving timeout away from the front of the pipeline seems to be quite a sane approach.
Also note this is because of how timeout behaves. Other signal-generating processes may not need such a workaround.
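One candidate that avoids reordering the pipeline is the --foreground option of GNU coreutils timeout. It is offered here as a suggestion to verify against your own version, not as part of the strace analysis above. Its documentation says that in this mode children of the command will not be timed out, and as far as I can tell timeout then signals only the command it started, not the whole process group. This is a minimal sketch, assuming bash and GNU timeout, with sleep standing in for cat and gpg.

```shell
# Assumption: GNU coreutils timeout (for --foreground) and bash are installed.
statuses=$(bash -c '
  set -m   # job control, so the pipeline gets its own process group
  timeout --foreground 1 sleep 2 | sleep 2
  echo "${PIPESTATUS[*]}"
')
echo "exit statuses (timeout, second sleep): $statuses"
```

If the second status is 0 rather than 143, the signal did not reach the rest of the pipeline. The trade-off is that any processes spawned by the timed command itself are no longer signalled either.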