Why does piping input to "read" only work when fed into a "while read ..." construct?
I've been trying to read program output into shell variables like this:
echo first second | read A B ; echo $A-$B
And the result is:
-
Both A and B are always empty. I read that bash executes piped commands in a sub-shell, which basically prevents one from piping input to read. However, the following:
echo first second | while read A B ; do echo $A-$B ; done
Seems to work, the result is:
first-second
Can someone please explain the logic here? Is it that the commands inside the while ... done construct are actually executed in the same shell as echo, and not in a sub-shell?
Solution 1:
How to loop over stdin and get the result stored in a variable
Under bash (and other shells too), when you pipe something to another command via |, you implicitly create a fork: a subshell that is a child of the current session. The subshell can't affect the current session's environment.
So this:
TOTAL=0
printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 |
while read A B;do
((TOTAL+=A-B))
printf "%3d - %3d = %4d -> TOTAL= %4d\n" $A $B $((A-B)) $TOTAL
done
echo final total: $TOTAL
won't give the expected result:
9 - 4 = 5 -> TOTAL= 5
3 - 1 = 2 -> TOTAL= 7
77 - 2 = 75 -> TOTAL= 82
25 - 12 = 13 -> TOTAL= 95
226 - 664 = -438 -> TOTAL= -343
echo final total: $TOTAL
final total: 0
The computed TOTAL can't be reused in the main script.
Inverting the fork
By using bash process substitution, here documents or here strings, you can invert the fork:
Here strings
read A B <<<"first second"
echo $A
first
echo $B
second
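The here-string form also works when the text comes from another program, which is the situation in the question. A minimal sketch, assuming bash (here strings are not POSIX) and using echo first second as a stand-in for the real program:

```shell
#!/usr/bin/env bash
# The command substitution runs echo in a subshell, but read itself
# runs in the current shell, so A and B are still set afterwards.
read -r A B <<<"$(echo first second)"
echo "$A-$B"
```

Run as a script, this should print first-second, the result the question was looking for.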
Here Documents
while read A B;do
echo $A-$B
C=$A-$B
done << eodoc
first second
third fourth
eodoc
first-second
third-fourth
outside of the loop:
echo : $C
: third-fourth
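A here document can carry program output too, because command substitution is expanded inside a here document whose delimiter is not quoted. The following is a sketch rather than part of the original answer; the delimiter name EOF and the shortened input (only the first two pairs) are chosen here for illustration:

```shell
TOTAL=0
while read -r A B; do
  TOTAL=$((TOTAL + A - B))     # runs in the current shell, not in a fork
done <<EOF
$(printf "%s %s\n" 9 4 3 1)
EOF
echo "$TOTAL"                  # (9-4) + (3-1) = 7
```

Only the printf runs in a subshell; the loop stays in the main shell, so TOTAL keeps its value after done.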
Here Commands (process substitution)
TOTAL=0
while read A B;do
((TOTAL+=A-B))
printf "%3d - %3d = %4d -> TOTAL= %4d\n" $A $B $((A-B)) $TOTAL
done < <(
printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664
)
9 - 4 = 5 -> TOTAL= 5
3 - 1 = 2 -> TOTAL= 7
77 - 2 = 75 -> TOTAL= 82
25 - 12 = 13 -> TOTAL= 95
226 - 664 = -438 -> TOTAL= -343
# and finally out of loop:
echo $TOTAL
-343
Now you can use $TOTAL in your main script.
Piping to a command list
But to work only against stdin, you can put a kind of script inside the fork:
printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 | {
TOTAL=0
while read A B;do
((TOTAL+=A-B))
printf "%3d - %3d = %4d -> TOTAL= %4d\n" $A $B $((A-B)) $TOTAL
done
echo "Out of the loop total:" $TOTAL
}
Will give:
9 - 4 = 5 -> TOTAL= 5
3 - 1 = 2 -> TOTAL= 7
77 - 2 = 75 -> TOTAL= 82
25 - 12 = 13 -> TOTAL= 95
226 - 664 = -438 -> TOTAL= -343
Out of the loop total: -343
Note: $TOTAL can't be used in the main script (after the last right curly bracket }).
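If the value is needed after the closing brace anyway, one option is to print it from inside the group and capture that output with command substitution. This is a sketch under assumptions, not something the original answer proposes: the names total, sum, a and b are hypothetical, and the per-line printf is dropped so that the final value is the only thing written to stdout.

```shell
total=$(
  printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 | {
    sum=0
    while read -r a b; do
      sum=$((sum + a - b))   # still inside the pipeline's subshell
    done
    echo "$sum"              # the only output, captured by $(...)
  }
)
echo "total after the group: $total"
```

The variable sum is still lost when the group ends; what crosses the process boundary is the text the group prints, so this only suits results that can be passed as a string.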
Using lastpipe bash option
As @CharlesDuffy correctly pointed out, there is a bash option that changes this behaviour. But for it to take effect, job control has to be disabled first:
shopt -s lastpipe # Set *lastpipe* option
set +m # Disabling job control
TOTAL=0
printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 |
while read A B;do
((TOTAL+=A-B))
printf "%3d - %3d = %4d -> TOTAL= %4d\n" $A $B $((A-B)) $TOTAL
done
9 - 4 = 5 -> TOTAL= 5
3 - 1 = 2 -> TOTAL= 7
77 - 2 = 75 -> TOTAL= 82
25 - 12 = 13 -> TOTAL= 95
226 - 664 = -438 -> TOTAL= -343
echo final total: $TOTAL
final total: -343
This works, but I (personally) don't like it because it is not standard and doesn't help make the script readable. Also, disabling job control seems a high price to pay for this behaviour.
Note: Job control is enabled by default only in interactive sessions, so set +m is not required in normal scripts.
So a script that omits set +m behaves differently when its commands are typed in a console than when it is run as a script. That is not going to make it easy to understand or to debug...
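Applied to the one-liner from the question, the option looks like this in a non-interactive script. A minimal sketch, assuming bash 4.2 or later (the release that added lastpipe) and that the file is run as a script, so job control is already off:

```shell
#!/usr/bin/env bash
shopt -s lastpipe                 # run the last pipeline element in the current shell

echo first second | read -r A B   # read now executes in the script's own shell
echo "$A-$B"
```

Run as a script, this should print first-second. Pasted into an interactive prompt without set +m, the same lines would print a bare - again, which is the portability trap described above.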
Solution 2:
a much cleaner work-around...
first="firstvalue"
second="secondvalue"
read -r a b < <(echo "$first $second")
echo "$a $b"
This way, read isn't executed in a sub-shell (whose variables would be lost as soon as that sub-shell ended). Instead, the values you want to read are echoed in a sub-shell, which automatically inherits the variables from the parent shell.
Solution 3:
First, this pipe-chain is executed:
echo first second | read A B
then
echo $A-$B
Because read A B is executed in a subshell, A and B are lost.
If you do this:
echo first second | (read A B ; echo $A-$B)
then both read A B and echo $A-$B are executed in the same subshell (see the bash manpage and search for "(list)").
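To see where that subshell ends, compare the variables inside and outside the parentheses. A small sketch; the inside:/outside: labels and the <empty> placeholder are only there for illustration:

```shell
unset A B
echo first second | (read -r A B; echo "inside:  $A-$B")
echo "outside: ${A:-<empty>}-${B:-<empty>}"
```

The first line of output shows first-second, while the second shows the placeholders, because A and B only ever existed in the parenthesised subshell.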
Solution 4:
What you are seeing is the separation between processes: the read occurs in a subshell, a separate process which cannot alter the variables in the main process (where the later echo commands run).
A pipeline (like A | B) implicitly places each component in a sub-shell (a separate process), even for built-ins (like read) that usually run in the context of the shell (in the same process).
The difference in the case of "piping into while" is an illusion. The same rule applies there: the loop is the second half of a pipeline, so it is in a subshell. But the whole loop is in the same subshell, so there is no process boundary between read and the commands in the loop body.
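The process boundary can be made visible by printing the PID at each point. A sketch assuming bash 4 or later, where $BASHPID gives the PID of the process actually executing the line ($$ keeps reporting the main shell's PID even inside a subshell); the two input lines are made up for the demonstration:

```shell
#!/usr/bin/env bash
unset A B
echo "main shell: $BASHPID"
printf '%s\n' "first second" "third fourth" |
  while read -r A B; do
    # Every iteration runs in the same forked process, so A and B are visible here.
    echo "loop body:  $BASHPID  A=$A B=$B"
  done
# Back in the main shell: the loop's process is gone, and A and B with it.
echo "after loop: A='$A' B='$B'"
```

Both loop iterations should report the same PID, different from the main shell's, and the last line should show A and B empty.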