How does input redirection work?

Yes, I have already tried looking this up elsewhere, but the examples that are supposed to illustrate input redirection always have one confusing caveat. One site, for example, gives:

  # echo 'hello world' >output
  # cat <output

The first line writes "hello world" to the file "output", the second reads it back and writes it to standard output (normally the terminal).

However, cat output would do the exact same thing, with no need for < here. So what is the difference?


Input redirection (as in cat < file) means the shell opens the input file and connects it to the standard input of the process it starts; the program then reads the contents from its standard input and never sees the file name. Passing the file as an argument (as you do when running cat file) means the program you are using (e.g. cat) needs to open the file itself and read the contents.

Basically, command file passes a file name to command, while command < file gives command the contents of the file on its standard input. Yes, in cases like cat file vs cat < file there is no easily perceived difference in outcome, but the two work in different ways.
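One way to see who opens the file is to point both forms at a file that does not exist and check who reports the error. This is only a minimal sketch, assuming a POSIX shell; the name no_such_file_$$ is a placeholder I made up, and the exact wording of the messages varies between shells and systems:

```shell
# Placeholder name for a file that is assumed not to exist.
missing=./no_such_file_$$

# File name as argument: cat tries to open the file, so cat reports the failure.
# (cat is expected to fail here, hence the "|| true".)
arg_err=$(cat "$missing" 2>&1) || true

# Input redirection: the shell tries to open the file before cat is started,
# so the shell reports the failure and cat never runs.
# The braces let us capture the shell's own error message.
redir_err=$( { cat < "$missing"; } 2>&1 ) || true

printf 'argument:    %s\n' "$arg_err"
printf 'redirection: %s\n' "$redir_err"
```

The first message should be prefixed with cat:, while the second comes from the shell itself.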

To understand the difference, think of a young child and an adult. Both of them can drink water. However, the adult can open the tap and fill a glass (open the file and read its contents) while the child needs the water to be given to it directly (it can't open the file and can only process its contents).

Some programs, like cat, are capable of taking a filename as input and then opening the file and doing their thing on it. That's why cat file works. Other programs, however, don't have any knowledge of what files are or how to use them. All they know about is input streams (like the file's contents). For example, tr:

$ cat file
foo
$ cat file | tr 'o' 'b'  ## tr can read a stream
fbb
$ tr 'o' 'b' file  ## tr can't deal with files
tr: extra operand ‘file’
Try 'tr --help' for more information.
$ tr 'o' 'b' < file ## input redirection!
fbb

Another example is ls which can deal with files just fine, but ignores input streams:

$ ls
file1  file2
$ ls file1   ## lists only file1: ls takes file names as arguments
file1
$ ls < file1 ## ls ignores its standard input, this is the same as ls alone
file1  file2

Other programs can't deal with streams and instead require files:

$ rm < file ## fails, rm needs a file 
rm: missing operand
Try 'rm --help' for more information.
$ rm file ## works, file is deleted

Some programs can deal with both opening files and reading input streams but behave in different ways with each. For example, wc, when given a file to open, prints the number of lines, words and bytes followed by the name of the file:

$ wc file
1 1 4 file

But, if we just give it a stream, it has no way of knowing that this is coming from a specific file so no file name is printed:

$ wc < file
1 1 4

The md5sum command behaves similarly:

$ md5sum file
17fd54512c91e3cd0f70fbaaa9a94d0d  file
$ md5sum < file
17fd54512c91e3cd0f70fbaaa9a94d0d  - 

Note that in the first case the file name file is shown while, in the second, "filename" is -: standard input.
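This difference is handy in scripts: redirecting the file into wc gives just the number, with no file name to strip off. A small sketch, where the temporary file created with mktemp is only a stand-in for file:

```shell
# Throwaway file with the same content as "file" in the examples above.
tmp=$(mktemp)
printf 'foo\n' > "$tmp"

# With a file name argument, wc prints the count followed by the name.
with_name=$(wc -l "$tmp")

# With input redirection, wc only sees a stream, so it prints just the count.
# (Some wc implementations pad the number with leading spaces.)
count_only=$(wc -l < "$tmp")

printf 'argument:    %s\n' "$with_name"
printf 'redirection: %s\n' "$count_only"

rm -f "$tmp"
```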


Now, if you want more gritty details, you can use strace to see exactly what's going on:

strace -e trace=open,close,read,write wc file 2>strace1.txt

and

strace -e trace=open,close,read,write wc < file 2>strace2.txt

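Those will record every open(), close(), read() and write() call made by the process. On newer systems the file may be opened with openat() rather than open(); if the open() line is missing, add openat to the trace= list. What you want to see is that strace1.txt (when the file was passed as an argument and not with input redirection) contains these lines: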

open("file", O_RDONLY)                  = 3
read(3, "foo\n", 16384)                 = 4

Those mean that the file file was opened and attached to the file descriptor 3. Then, the string foo\n was read from 3. The equivalent part of the strace output when using input redirection is:

read(0, "foo\n", 16384)                 = 4

There is no corresponding open() call; instead, the string foo\n is read from 0, the standard input [1].


[1] By default, 0 is standard input, 1 is standard output and 2 is standard error. This, by the way, is why file was opened as 3: it was the next available descriptor.
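The shell can also open a file on a descriptor of your choosing, which makes the numbering visible. A minimal sketch, assuming a POSIX shell; descriptor 3 and the temporary file are arbitrary choices for illustration:

```shell
# Throwaway file standing in for "file" from the examples above.
tmp=$(mktemp)
printf 'foo\n' > "$tmp"

# Ask the shell to open the file for reading on descriptor 3.
exec 3< "$tmp"

# read normally reads from descriptor 0; <&3 makes descriptor 0 a copy of 3
# for this one command, so the line comes from the file.
IFS= read -r line <&3

# Close descriptor 3 and clean up.
exec 3<&-
rm -f "$tmp"

printf 'read from fd 3: %s\n' "$line"
```

Here read never opens anything itself; it just reads from a descriptor the shell prepared, which is the same arrangement as command < file.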


Basically the differences are:

  1. cat output.txt: cat opens output.txt itself and writes its contents to standard output.

  2. cat < output.txt: the shell opens output.txt and, because of the < redirection operator, connects it to the standard input of cat, which then reads it from there. Hence output.txt is used as input for the cat command.

The output of both methods will be the same, but in the second one the file is opened by the shell rather than by cat, as a result of the < redirection operator.