When setting IFS to split on newlines, why is it necessary to include a backspace?

I'm curious as to why the backspace is necessary when setting IFS to split on newlines like this:

IFS=$(echo -en "\n\b")

Why can I not just use this (which doesn't work) instead?

IFS=$(echo -en "\n")

I'm on a Linux system that is saving files with Unix line endings. I've converted my file with newlines to hex and it definitely only uses "0a" as the newline character.

I've googled a lot and although many pages document the newline followed by backspace solution, none that I have found explain why the backspace is required.

-David.


Solution 1:

Because, as the Bash manual says regarding command substitution:

Bash performs the expansion by executing command and replacing the command substitution with the standard output of the command, with any trailing newlines deleted.

So, by adding \b you prevent removal of \n.

A cleaner way to do this could be to use $'' quoting, like this:

IFS=$'\n'
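
As a rough check of the $'' approach, the sketch below sets IFS that way and splits a two-line string. It assumes bash, since $'...' quoting is not available in every sh; the sample text and variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# $'...' (ANSI-C quoting) is assumed here; it works in bash but not in every sh.
IFS=$'\n'
ifs_len=${#IFS}            # length of IFS: just the newline, no helper character

text=$'first line\nsecond line'   # hypothetical sample data
set -f                     # keep globbing out of the unquoted expansion
set -- $text               # word splitting happens here, on newline only
set +f
unset IFS                  # restore default splitting

field_count=$#
second=$2
printf '%s\n' "$ifs_len"      # prints 1
printf '%s\n' "$field_count"  # prints 2
```

The spaces inside each line survive because only the newline is a separator.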

Solution 2:

I just remembered the easiest way. Tested with bash on Debian wheezy.

IFS="
"

no kidding :)

Solution 3:

It's a hack because of the use of echo and command substitution.

prompt> x=$(echo -en "\n")
prompt> echo ${#x}
0
prompt> x=$(echo -en "\n\b")
prompt> echo ${#x}
2

The $() strips trailing newlines and \b prevents \n from being a trailing newline while being highly unlikely to appear in any text. IFS=$'\n' is the better way to set IFS to split on newlines.

Solution 4:

The \b character is appended to the newline \n because command substitution $(...) removes trailing newlines. With \b as a suffix, the \n is no longer trailing, so it is returned from the command substitution.

The side effect is that IFS will also include the \b character as a separator, instead of just \n, which is the only character we actually want.
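
A small sketch of that side effect, assuming a POSIX-style shell: the sample string is hypothetical and contains one backspace (octal 010) and no newline, so any split it undergoes is caused by the \b in IFS.

```shell
#!/bin/sh
# Hypothetical sample data: a single field containing a backspace (octal 010).
sample=$(printf 'left\010right')

set -f                     # no globbing during the unquoted expansions

# Same value the "\n\b" trick produces: IFS holds newline AND backspace.
IFS=$(printf '\n\b')
set -- $sample             # word splitting happens here
with_backspace=$#

# Newline only, written as a literal line break between the quotes.
IFS='
'
set -- $sample
newline_only=$#

set +f
unset IFS                  # back to default splitting

printf 'newline+backspace: %s fields\n' "$with_backspace"   # 2
printf 'newline only:      %s fields\n' "$newline_only"     # 1
```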

If you expect \b may someday appear in the string (why not?), then you may use:

IFS="$(printf '\nx')" && IFS="${IFS%x}";

The command substitution returns \n followed by x, and ${IFS%x} then strips the x,

so IFS now contains only the \n character.

IFS="$(printf '\nx')" && IFS="${IFS%x}";
echo ${#IFS}; # 1

and nothing will break in case of \b, test:

#!/bin/sh

sentence=$(printf "Foo\nBar\tBaz Maz\bTaz");
IFS="$(printf '\nx')" && IFS="${IFS%x}";

for entry in $sentence
do
    printf 'Entry: %s.\n' "$entry";
done

gives two lines (due to one \n):

Entry: Foo.
Entry: Bar      Baz Maz Taz.

as expected.

Instead of IFS="$(printf '\nx')" && IFS="${IFS%x}", using:

IFS="
"

gives the same result, but these two lines must not be indented, and if you accidentally put a space, a tab, or any other whitespace character between the two quotes, IFS will no longer contain only the \n character, but some "bonuses" as well. This bug is hard to spot unless you use the "show all characters" option in your editor.
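
If an editor option is not at hand, one possible way to catch such a stray character is to dump IFS as hex bytes. This is only a sketch: show_bytes is a hypothetical helper name, and the "bad" value is built with printf purely to imitate an accidental space before the line break.

```shell
#!/bin/sh
# show_bytes is a hypothetical helper: prints its argument as hex bytes.
show_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

IFS='
'
ifs_hex=$(show_bytes "$IFS")

# Imitates the accident: a space slipped in before the newline.
bad=$(printf ' \nx'); bad=${bad%x}
bad_hex=$(show_bytes "$bad")

unset IFS                  # restore default splitting

printf 'IFS bytes: %s\n' "$ifs_hex"   # 0a
printf 'bad bytes: %s\n' "$bad_hex"   # 200a
```

Anything other than a lone 0a means IFS holds more than the newline.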