Bash read ignores leading spaces

I have a file a.txt with the following content:

    aaa
    bbb

When I execute the following script:

while read line
do
    echo $line
done < a.txt > b.txt

the generated b.txt contains the following:

aaa
bbb

The leading spaces of each line have been removed. How can I preserve them?


Solution 1:

This is covered in the Bash FAQ entry on reading data line-by-line.

The read command modifies each line read; by default it removes all leading and trailing whitespace characters (spaces and tabs, or any whitespace characters present in IFS). If that is not desired, the IFS variable has to be cleared:

# Exact lines, no trimming
while IFS= read -r line; do
  printf '%s\n' "$line"
done < "$file"
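To see the trimming in isolation, here is a minimal sketch that reads the same string twice, once with the default IFS and once with IFS cleared. The sample string and the variable names `input`, `trimmed`, and `exact` are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sample: one line with leading and trailing spaces.
input='    aaa  '

# Default IFS: read strips leading and trailing whitespace.
read -r trimmed <<< "$input"

# IFS cleared for this single read command only: the line is kept as-is.
IFS= read -r exact <<< "$input"

# Brackets make the surviving whitespace visible.
printf '[%s]\n' "$trimmed"   # prints [aaa]
printf '[%s]\n' "$exact"     # prints [    aaa  ]
```

Writing `IFS=` as a prefix to `read` changes it only for that command, so the rest of the script keeps its normal IFS.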

As Charles Duffy correctly points out (and I had missed by focusing on the IFS issue), if you want to see the spaces in your output you also need to quote the variable when you use it; otherwise the shell will, once again, drop the whitespace.
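A small sketch of the quoting point, assuming the variable already holds the untrimmed line (the names `unquoted` and `quoted` are illustrative only):

```bash
#!/usr/bin/env bash
# Assume read has already preserved the leading spaces.
line='    aaa'

# Unquoted: the shell word-splits $line, so echo only ever sees "aaa".
unquoted=$(echo $line)

# Quoted: the value reaches printf as one argument, spaces included.
quoted=$(printf '%s\n' "$line")

printf '[%s]\n' "$unquoted"   # prints [aaa]
printf '[%s]\n' "$quoted"     # prints [    aaa]
```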

Some notes on the other differences between that quoted snippet and your original code:

The use of the -r argument to read is covered in a single sentence at the top of the previously linked page.

The -r option to read prevents backslash interpretation (usually used as a backslash newline pair, to continue over multiple lines). Without this option, any backslashes in the input will be discarded. You should almost always use the -r option with read.
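The effect of -r can be checked with a short sketch like the one below. The Windows-style path and the variable names are invented for the demonstration.

```bash
#!/usr/bin/env bash
# Hypothetical input containing literal backslashes.
input='C:\temp\new'

# Without -r: each backslash is consumed as an escape character.
read without_r <<< "$input"

# With -r: backslashes are kept as ordinary characters.
read -r with_r <<< "$input"

printf '[%s]\n' "$without_r"   # prints [C:tempnew]
printf '[%s]\n' "$with_r"      # prints [C:\temp\new]

# Without -r, a trailing backslash also joins the next line onto this one.
read joined < <(printf 'first \\\nsecond\n')
printf '[%s]\n' "$joined"      # prints [first second]
```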

As for using printf instead of echo: the behavior of echo is, somewhat unfortunately, not consistent across environments, and the differences can be awkward to deal with. printf, on the other hand, behaves consistently and can be used robustly.
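One concrete case where the two differ, sketched here under the assumption that the script runs in bash with its builtin echo in the default configuration: a line whose entire content looks like an echo option.

```bash
#!/usr/bin/env bash
# Hypothetical input line that happens to look like an option.
line='-n'

# bash's builtin echo takes -n as "no trailing newline" and prints nothing.
from_echo=$(echo "$line")

# printf only ever treats "$line" as data for the %s placeholder.
from_printf=$(printf '%s\n' "$line")

printf '[%s]\n' "$from_echo"     # prints []
printf '[%s]\n' "$from_printf"   # prints [-n]
```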

Solution 2:

There are several problems here:

  • Unless IFS is cleared, read strips leading and trailing whitespace.
  • echo $line string-splits and glob-expands the contents of $line, breaking it up into individual words, and passing those words as individual arguments to the echo command. Thus, even with IFS cleared at read time, echo $line would still discard leading and trailing whitespace, and change runs of whitespace between words into a single space character each. Additionally, a line containing only the character * would be expanded to contain a list of filenames.
  • echo "$line" is a significant improvement, but still won't correctly handle values such as -n, which it treats as an echo argument itself. printf '%s\n' "$line" would fix this fully.
  • read without -r treats backslashes as escape characters (and a backslash-newline pair as a line continuation) rather than literal content, so they won't be included in the values produced unless doubled up to escape themselves.
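The glob-expansion and whitespace-collapsing effects of an unquoted expansion can be reproduced with a sketch like this one. It works in a throwaway directory; the file names `one.txt` and `two.txt` and all variable names are made up for the example.

```bash
#!/usr/bin/env bash
# Scratch directory with two known files, so the glob result is predictable.
tmpdir=$(mktemp -d)
touch "$tmpdir/one.txt" "$tmpdir/two.txt"

# A line consisting only of an asterisk.
line='*'

# Unquoted: the shell replaces * with the file names in the current directory.
globbed=$(cd "$tmpdir" && echo $line)

# Quoted: the asterisk is passed through untouched.
literal=$(cd "$tmpdir" && printf '%s\n' "$line")

# Unquoted expansion also collapses runs of whitespace between words.
inner='a    b'
collapsed=$(echo $inner)

printf '[%s]\n' "$globbed"     # prints [one.txt two.txt]
printf '[%s]\n' "$literal"     # prints [*]
printf '[%s]\n' "$collapsed"   # prints [a b]

rm -rf "$tmpdir"
```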

Thus:

while IFS= read -r line; do
  printf '%s\n' "$line"
done
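As a quick check, the loop can be wired to the files from the question and the result compared byte for byte. This is only a sketch: it recreates a.txt in a temporary directory (the `workdir` and `result` names are illustrative) rather than touching any real files.

```bash
#!/usr/bin/env bash
# Recreate the question's input in a scratch directory.
workdir=$(mktemp -d)
printf '    aaa\n    bbb\n' > "$workdir/a.txt"

# Same loop as above, with input and output redirected as in the question.
while IFS= read -r line; do
  printf '%s\n' "$line"
done < "$workdir/a.txt" > "$workdir/b.txt"

# cmp -s is silent and succeeds only when both files are identical.
if cmp -s "$workdir/a.txt" "$workdir/b.txt"; then
  result=identical
else
  result=different
fi
printf '%s\n' "$result"   # prints identical

rm -rf "$workdir"
```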