how to `tail` the latest file in a directory

In shell, how can I tail the latest file created in a directory?

tail `ls -t | head -1`

If you're worried about filenames with spaces,

tail "`ls -t | head -1`"

Do not parse the output of ls! Parsing the output of ls is difficult and unreliable.
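To see the failure concretely, here is a small sketch that builds a scratch directory holding one file whose name contains a newline. The function name demo_ls_breaks and the use of mktemp are my own choices for illustration, and the code assumes bash.

```bash
# Hypothetical demo: `ls -t | head -1` truncates a name that contains a newline.
demo_ls_breaks () {
    local dir name got
    dir=$(mktemp -d) || return 1
    name=$'first line\nsecond line'      # a legal file name with an embedded newline
    : > "$dir/$name"
    # head -1 sees only the text up to the first newline in ls's output
    got=$(cd "$dir" && ls -t | head -1)
    rm -rf -- "$dir"
    printf '%s\n' "$got"
}
demo_ls_breaks
```

The value handed to tail would be only the first half of the name, which does not exist as a file.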

If you must do this I recommend using find. Originally I had here a simple example merely to give you the gist of the solution, but since this answer seems somewhat popular I decided to revise this to provide a version that is safe to copy/paste and use with all inputs. Are you sitting comfortably? We'll start with a oneliner that will give you the latest file in the current directory:

tail -- "$(find . -maxdepth 1 -type f -printf '%T@.%p\0' | sort -znr -t. -k1,2 | while IFS= read -r -d '' record ; do printf '%s' "$record" | cut -d. -f3- ; break ; done)"

Not quite a oneliner now, is it? Here it is again as a shell function and formatted for easier reading:

latest-file-in-directory () {
    find "${@:-.}" -maxdepth 1 -type f -printf '%T@.%p\0' | \
            sort -znr -t. -k1,2 | \
            while IFS= read -r -d '' record ; do
                    printf '%s' "$record" | cut -d. -f3-
                    break
            done
}

And now that as a oneliner:

tail -- "$(latest-file-in-directory)"

If all else fails you can include the above function in your .bashrc and consider the problem solved, with one caveat. If you just wanted to get the job done you need not read further.

The caveat with this is that a file name ending in one or more newlines will still not be passed to tail correctly, because command substitution strips trailing newlines from the output it captures. Working around this problem is complicated, and I consider it sufficient that such a malicious file name produces a relatively safe "No such file" error instead of anything more dangerous.
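The caveat comes from the way command substitution drops every trailing newline from what it captures. A minimal sketch, using a made-up file name and assuming bash:

```bash
# Command substitution removes all trailing newlines from the captured output.
name=$'report\n\n'                 # hypothetical file name ending in two newlines
captured=$(printf '%s' "$name")    # what a "$(...)" expansion would hand to tail
printf 'original length: %d\n' "${#name}"
printf 'captured length: %d\n' "${#captured}"
```

The captured value is two characters shorter than the real name, so tail looks for a file called report and reports that it does not exist.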

Juicy details

For the curious this is the tedious explanation of how it works, why it's safe and why other methods probably aren't.

Danger, Will Robinson

First of all, the only byte that is safe to delimit file paths is null because it is the only byte universally forbidden in file paths on Unix systems. It is important when handling any list of file paths to only use null as a delimiter and, when handing even a single file path from one program to another, to do so in a manner which will not choke on arbitrary bytes. There are many seemingly-correct ways to solve this and other problems which fail by assuming (even accidentally) that file names will not have either newlines or spaces in them. Neither assumption is safe.
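As a rough illustration of the difference, the sketch below counts the same single file twice, once from newline-delimited output and once from null-delimited output. The function name count_both_ways is hypothetical, and the code assumes bash plus a find that supports -print0.

```bash
# Hypothetical demo: one file, counted by newline-delimited and null-delimited output.
count_both_ways () {
    local dir by_newline by_null
    dir=$(mktemp -d) || return 1
    : > "$dir/"$'a\nb'                       # a single file with a newline in its name
    by_newline=$(find "$dir" -type f -print | wc -l)
    by_null=$(find "$dir" -type f -print0 | tr -cd '\0' | wc -c)
    rm -rf -- "$dir"
    printf 'newline-delimited: %d\nnull-delimited: %d\n' "$by_newline" "$by_null"
}
count_both_ways
```

Only the null-delimited count matches the number of files that actually exist.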

For today's purposes step one is to get a null-delimited list of files out of find. This is pretty easy if you have a find supporting -print0 such as GNU's:

find . -print0

But this list still does not tell us which one is newest, so we need to include that information. I choose to use find's -printf switch which lets me specify what data appears in the output. Not all versions of find support -printf (it is not standard) but GNU find does. If you find yourself without -printf you will need to rely on -exec stat {} \; at which point you must give up all hope of portability as stat is not standard either. For now I'm going to move on assuming you have GNU tools.

find . -printf '%T@.%p\0'

Here I am asking for printf format %T@ which is the modification time in seconds since the beginning of the Unix epoch followed by a period and then followed by a number indicating fractions of a second. I add to this another period and then %p (which is the full path to the file) before ending with a null byte.
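If you want to look at one raw record, something like the following should work. It is only a sketch: it assumes GNU find (for -printf) and bash, and the function name show_record and the file name sample are invented for the demo.

```bash
# Hypothetical demo (requires GNU find): print one raw record with the null made visible.
show_record () {
    local dir
    dir=$(mktemp -d) || return 1
    : > "$dir/sample"
    # tr turns the terminating null byte into a newline so the record can be displayed
    (cd "$dir" && find . -maxdepth 1 -type f -printf '%T@.%p\0' | tr '\0' '\n')
    rm -rf -- "$dir"
}
show_record
```

The output is the seconds, a period, the fractional seconds, the extra period added by the format string, and then the path ./sample.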

Now I have

find . -maxdepth 1 \! -type d -printf '%T@.%p\0'

It may go without saying but for the sake of being complete -maxdepth 1 prevents find from listing the contents of sub directories and \! -type d skips directories which you are unlikely to want to tail. (The one-liner and function above use the stricter -type f, which keeps only regular files.) So far I have files in the current directory with modification time information, so now I need to sort by that modification time.

Getting it in the right order

By default sort expects its input to be newline-delimited records. If you have GNU sort you can ask it to expect null-delimited records instead by using the -z switch; for standard sort there is no solution. I am only interested in sorting by the first two numbers (seconds and fractions of a second) and don't want to sort by the actual file name, so I tell sort two things: first, that it should consider the period (.) a field delimiter, and second, that it should only use the first and second fields when considering how to sort the records.

| sort -znr -t. -k1,2

First of all I am bundling three short options that take no value together (-znr is just a concise way of saying -z -n -r). After that -t . (the space is optional) tells sort the field delimiter character and -k 1,2 specifies the field numbers: first and second (sort counts fields from one, not zero). Remember that a sample record for the current directory would look like:

1000000000.0000000000../some-file-name

This means sort will look at first 1000000000 and then 0000000000 when ordering this record. The -n option tells sort to use numeric comparison when comparing these values, because both values are numbers. This may not be important since the numbers are of fixed length but it does no harm.

The other switch given to sort is -r for "reverse." By default the output of a numeric sort will be lowest numbers first; -r changes it so that it lists the highest numbers first and the lowest numbers last. Since these numbers are timestamps, higher means newer, and this puts the newest record at the beginning of the list.
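The sort step can be tried in isolation on a few hand-written records. This is a sketch with made-up timestamps and names, assuming a sort that supports -z; the function name newest_first is hypothetical.

```bash
# Three made-up records in the same layout find produces, each terminated by a null byte.
newest_first () {
    printf '%s\0' \
        '1000000000.0000000000../old file' \
        '1500000000.5000000000../newest' \
        '1500000000.2500000000../middle' |
        sort -znr -t. -k1,2 |
        tr '\0' '\n'              # nulls become newlines only for display
}
newest_first
```

The record with the highest timestamp comes out first, and the two records that share the same whole second are ordered by their fractional part.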

Just the important bits

As the list of file paths emerges from sort it now has the answer we're looking for right at the top. What remains is to find a way to discard the other records and to strip the timestamp. At the time of writing, even GNU head and tail did not accept switches to make them operate on null-delimited input (newer GNU coreutils releases have since added a -z option to both). Instead I use a while loop as a kind of poor man's head.

| while IFS= read -r -d '' record

First I set IFS to the empty string so that read does not strip leading or trailing whitespace from the record. Next I tell read two things: do not interpret escape sequences in the input (-r) and the input is delimited with a null byte (-d); here the empty string '' is used to indicate "no delimiter" aka delimited by null. Each record will be read in to the variable record so that each time the while loop iterates it has a single timestamp and a single file name. Note that -d is an extension provided by bash and some other shells (read is a shell builtin, not a GNU tool); if you have only a standard read this technique will not work and you have little recourse.
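Here is the loop on its own, fed two hand-made records so you can see that it returns only the first one, newline included. It assumes bash (for read -d), and first_record is a name I made up.

```bash
# Read only the first null-delimited record, even when it contains a newline.
first_record () {
    local record
    while IFS= read -r -d '' record ; do
        printf '%s' "$record"
        break                     # stop after the first record, like `head -1`
    done
}
printf '%s\0' $'two\nlines' 'second record' | first_record
```

Because IFS is empty for the read, leading and trailing spaces in a record also survive untouched.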

We know that the record variable has three parts to it, all delimited by period characters. Using the cut utility it is possible to extract a portion of them.

printf '%s' "$record" | cut -d. -f3-

Here the entire record is passed to printf and from there piped to cut; in bash you could simplify this further using a here string, cut -d. -f3- <<<"$record", for better performance. We tell cut two things: First, with -d, that it should use a specific delimiter for identifying fields (as with sort, the delimiter . is used). Second, cut is instructed with -f to print only values from specific fields; the field list is given as a range 3- which indicates the value from the third field and from all following fields. This means that cut will read and ignore everything up to and including the second . that it finds in the record and then will print the remainder, which is the file path portion.
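A quick way to convince yourself that periods inside the file name are not lost is to run cut on a hand-written record. The function name strip_timestamp and the sample record are invented for this sketch.

```bash
# Strip the two timestamp fields from a record; dots inside the path are preserved.
strip_timestamp () {
    printf '%s' "$1" | cut -d. -f3-
}
strip_timestamp '1000000000.0000000000../some.file.name'   # prints ./some.file.name
```

Everything from the third field onward is printed with its original delimiters, so the leading ./ and the dots in the name come back intact.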

Having printed the newest file path there's no need to keep going: break exits the loop without letting it move on to the second file path.

The only thing that remains is running tail on the file path returned by this pipeline. You may have noticed in my example that I did this by enclosing the pipeline in a command substitution, $(...); what you may not have noticed is that I enclosed the command substitution in double quotes. This is important because, even after all of this effort to be safe for any file name, an unquoted expansion would still be subject to word splitting and globbing and could break things. A more detailed explanation is available if you're interested. The second important but easily-overlooked aspect to the invocation of tail is that I provided the option -- to it before expanding the file name. This will instruct tail that no more options are being specified and everything following is a file name, which makes it safe to handle file names that begin with -.
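Both precautions can be checked with a deliberately awkward file name. The sketch below assumes bash; the function name demo_tail_safely and the file name are made up, and the inner printf merely stands in for the real pipeline.

```bash
# Hypothetical demo: a file literally named "-n 5 notes" (leading dash, spaces).
demo_tail_safely () {
    local dir file
    dir=$(mktemp -d) || return 1
    file='-n 5 notes'
    printf 'line %s\n' 1 2 3 > "$dir/$file"
    # Quotes keep the name as one argument; -- stops tail from parsing it as options.
    (cd "$dir" && tail -- "$(printf '%s' "$file")")
    rm -rf -- "$dir"
}
demo_tail_safely
```

Without the double quotes the name would be split into three words, and without -- the first of them would be taken as an option to tail.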
