Using “while read…”, echo and printf get different outcomes

Solution 1:

It's not just echo vs printf

First, let's understand what happens in the read a b c part. read performs word splitting based on the value of the IFS variable, which defaults to space, tab, and newline, and assigns the resulting words to the variables in order. If there are more words than variables, each of the first variables gets one word and whatever is left over goes into the last one. Here's what I mean:

bash-4.3$ read a b c <<< "one two three four"
bash-4.3$ echo $a
one
bash-4.3$ echo $b
two
bash-4.3$ echo $c
three four

This is exactly how it is described in bash's manual (see the quote at the end of the answer).

In your case, 1 and 2 fit into the a and b variables, and c takes everything else, which is 3 4 5 6.
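
The same assignment can be reproduced with the six numbers from the question. This is a minimal sketch that uses a here-document instead of a here-string, only so that it runs in both bash and sh:

```shell
# Feed six words to read while giving it only three names.
read a b c <<EOF
1 2 3 4 5 6
EOF

echo "a=$a"   # a=1
echo "b=$b"   # b=2
echo "c=$c"   # c=3 4 5 6
```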

You will also often see people use while IFS= read -r line; do ... ; done < input.txt to read text files line by line. Again, IFS= is there for a reason: it disables word splitting, so the whole line, including leading and trailing whitespace, lands in the variable unchanged. Without it, read would still put the entire line into the line variable, but it would strip leading and trailing whitespace first. The -r option stops read from treating backslashes as escape characters. But that's another story, which I encourage you to study later, since while IFS= read -r variable is a very frequently used structure.
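
A small sketch of what IFS= and -r change when reading into a single variable. The variable names input, plain and exact are made up for this illustration:

```shell
# A line with surrounding spaces and a literal backslash.
input='   indented\tline   '

# Default IFS and no -r: surrounding whitespace is trimmed
# and the backslash is consumed as an escape character.
read plain <<EOF
$input
EOF

# IFS= and -r: the line is stored exactly as written.
IFS= read -r exact <<EOF
$input
EOF

printf '[%s]\n' "$plain"   # [indentedtline]
printf '[%s]\n' "$exact"   # [   indented\tline   ]
```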

echo vs printf behavior

echo does what you'd expect here. It displays your variables exactly as read has arranged them. This has already been demonstrated above.

printf is special because it reuses the format string until all of its arguments are consumed. So when you do printf "%d, %d, %d \n" $a $b $c, printf sees a format string with 3 decimals, but there are more than 3 arguments (because your variables actually expand to the individual items 1, 2, 3, 4, 5, 6). This may sound confusing, but it exists for a reason, as an improvement over what the real printf() function does in the C language.
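
The format reuse can be seen without read at all by passing six literal arguments. A minimal sketch; the out variable is only there to hold the result:

```shell
# Three %d slots, six arguments: the format string is applied twice.
out=$(printf '%d, %d, %d \n' 1 2 3 4 5 6)
printf '%s\n' "$out"
# 1, 2, 3 
# 4, 5, 6 
```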

What also affects the output here is that your variables are not quoted, which allows the shell (not printf) to break the variables down into 6 separate items. Compare this with quoting:

bash-4.3$ read a b c <<< "1 2 3 4"
bash-4.3$ printf "%d %d %d\n" "$a" "$b" "$c"
bash: printf: 3 4: invalid number
1 2 3

Exactly because the $c variable is quoted, it is now recognized as one whole string, 3 4, and it doesn't fit the %d format, which expects a single integer.

Now do the same without quoting:

bash-4.3$ printf "%d %d %d\n" $a $b $c
1 2 3
4 0 0

printf in effect says: "OK, you have 4 items there but the format takes only 3, so I'll run the format again and fill any slot I cannot match to actual input with a default value", which is 0 for %d.
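
The padding can be seen in isolation with literal arguments. A short sketch, with nums and strs as made-up variable names, comparing %d with %s when the arguments run out:

```shell
# Four arguments for three slots: the second pass is missing two values.
nums=$(printf '%d %d %d\n' 1 2 3 4)        # missing %d arguments count as 0
strs=$(printf '[%s] [%s] [%s]\n' 1 2 3 4)  # missing %s arguments are empty strings

printf '%s\n' "$nums" "$strs"
# 1 2 3
# 4 0 0
# [1] [2] [3]
# [4] [] []
```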

And in all these cases you don't have to take my word for it. Just run strace -e trace=execve and see for yourself what the command actually "sees":

bash-4.3$ strace -e trace=execve printf "%d %d %d\n" $a $b $c
execve("/usr/bin/printf", ["printf", "%d %d %d\\n", "1", "2", "3", "4"], [/* 80 vars */]) = 0
1 2 3
4 0 0
+++ exited with 0 +++

bash-4.3$ strace -e trace=execve printf "%d %d %d\n" "$a" "$b" "$c"
execve("/usr/bin/printf", ["printf", "%d %d %d\\n", "1", "2", "3 4"], [/* 80 vars */]) = 0
1 2 printf: ‘3 4’: value not completely converted
3
+++ exited with 1 +++
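
If strace is not at hand, a small shell function gives a similar view of the argument list. This is only a sketch: show_args is a hypothetical helper written for this illustration, not a standard command, and it assumes IFS has its default value:

```shell
# show_args (hypothetical helper): print the number of arguments received
# and each argument in brackets.
show_args() {
    printf '%d args:' "$#"
    for arg in "$@"; do
        printf ' [%s]' "$arg"
    done
    printf '\n'
}

c='3 4'
show_args 1 2 $c     # 4 args: [1] [2] [3] [4]
show_args 1 2 "$c"   # 3 args: [1] [2] [3 4]
```

The unquoted call shows the shell splitting $c before the command ever runs, which matches the two execve lines above.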

Additional notes

As Charles Duffy properly pointed out in the comments, bash has its own built-in printf, which is what you're using in your command, whereas strace actually calls the /usr/bin/printf version, not the shell's version. Aside from minor differences (such as the wording of the error messages seen above), the standard format specifiers and their behavior are the same as far as this particular question is concerned.
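
To check which printf a given shell would run, something like the following can be used. It is a sketch that assumes an external printf is also installed somewhere in PATH, as it is on a typical Ubuntu system:

```shell
type printf          # in bash and dash this reports a shell builtin
command -v printf    # prints the bare name "printf" for a builtin

# env does not know about shell builtins, so this runs the external program.
env printf '%d %d %d\n' 1 2 3 4
```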

Keep in mind as well that printf is far more portable (and therefore preferred) than echo, not to mention that its syntax is familiar from C or any C-like language that has a printf() function. See this excellent answer by terdon on the subject of printf vs echo. While you can tailor the output to your specific shell on your specific version of Ubuntu, if you are going to be porting scripts across different systems, you should probably prefer printf to echo. Maybe you're a beginner system administrator working with Ubuntu and CentOS machines, or maybe even FreeBSD, who knows, and in such cases you will have to make choices.
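
As a rough illustration of the portability point, here are two values that echo may handle differently depending on the shell, while printf with %s prints them literally. The variable names tricky and weird are made up for this sketch:

```shell
tricky='-n'      # echo may treat this as an option instead of text
weird='a\tb'     # some echo implementations turn \t into a tab, others do not

# With %s the argument is always printed as-is, followed by a newline.
printf '%s\n' "$tricky"   # -n
printf '%s\n' "$weird"    # a\tb
```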

Quote from bash manual, SHELL BUILTIN COMMANDS section

read [-ers] [-a aname] [-d delim] [-i text] [-n nchars] [-N nchars] [-p prompt] [-t timeout] [-u fd] [name ...]

One line is read from the standard input, or from the file descriptor fd supplied as an argument to the -u option, and the first word is assigned to the first name, the second word to the second name, and so on, with leftover words and their intervening separators assigned to the last name. If there are fewer words read from the input stream than names, the remaining names are assigned empty values. The characters in IFS are used to split the line into words using the same rules the shell uses for expansion (described above under Word Splitting).
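
Two details of the quoted passage, empty values for surplus names and splitting on the characters in IFS, can be tried out as follows. The input lines are invented for this sketch:

```shell
# Fewer words than names: the surplus name is assigned an empty value.
read x y z <<EOF
only two
EOF
printf '[%s] [%s] [%s]\n' "$x" "$y" "$z"          # [only] [two] []

# A custom IFS changes the separator; the leftover words keep
# their intervening separators in the last name.
IFS=: read -r user pw rest <<EOF
root:x:0:0:root
EOF
printf '[%s] [%s] [%s]\n' "$user" "$pw" "$rest"   # [root] [x] [0:0:root]
```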

Solution 2:

This is only a suggestion and not meant to replace Sergiy's answer at all. I think Sergiy wrote a great answer as to why they print differently: read assigns 1 and 2 to a and b, and puts the remainder, 3 4 5 6, into $c. echo prints $c as a single string, whereas the unquoted $c is split by the shell into separate arguments that printf then consumes with the %ds.

You can, however, make them basically give you the same answers by manipulating the echo of the numbers at the beginning of the command:

In /bin/bash you can use:

echo -e "1 2 3 \n4 5 6"

In /bin/sh you can just use:

echo "1 2 3 \n4 5 6"

Bash's echo needs -e to enable backslash escapes, whereas sh (dash on Ubuntu) does not, because its echo interprets them by default. \n starts a new line, so the echo output is now split into two separate lines, and the while loop runs twice, once per line:

:~$ echo -e "1 2 3 \n4 5 6" | while read a b c; do echo "$a, $b, $c"; done
1, 2, 3
4, 5, 6

The printf command then produces the same output:

:~$ echo -e "1 2 3 \n4 5 6" | while read a b c ;do printf "%d, %d, %d \n" $a $b $c; done
1, 2, 3 
4, 5, 6 

In sh

$ echo "1 2 3 \n4 5 6" | while read a b c; do echo "$a, $b, $c"; done
1, 2, 3
4, 5, 6
$ echo "1 2 3 \n4 5 6" | while read a b c ;do printf "%d, %d, %d \n" $a $b $c; done
1, 2, 3 
4, 5, 6 
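
A variant that does not depend on how echo treats \n is to let printf produce the two input lines as well. This is a sketch of an alternative, not something from the original question, and it should behave the same in bash and sh:

```shell
# printf always interprets \n in its format string, so the loop
# receives two lines in any POSIX shell.
out=$(printf '1 2 3\n4 5 6\n' | while read -r a b c; do
    printf '%d, %d, %d\n' "$a" "$b" "$c"
done)

printf '%s\n' "$out"
# 1, 2, 3
# 4, 5, 6
```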

Hope this helps!