Why is the current directory in the ls command identified as linked to itself?
In the book "Learning the UNIX Operating System", section "3.1.8 Listing Files" describes the ls command. The paragraph on ls -l describes the columns of that command's output.
The second column of the ls -l output contains a single number. The book describes this number as "The number of files and directories linked to this one" (that is, linked to the file or directory named in the last column of the same row).
I tried this command and compared the output with the actual number of files and directories in the current directory.
ls -l
drwxr-xr-x 6 azbc staff 192 Sep 7 16:09 test
In the directory test I have 2 subdirectories, 1 regular file, and 1 hidden file, plus an entry for the current directory and an entry for the parent directory: 6 files and directories in total.
ls -a -F
./ .hidden_file.txt dir_2/
../ dir_1/ file_1.sh
It seems logical to me to identify all files and directories (including hidden files and directories) as linked to the current directory. It also seems logical to identify the parent directory as linked to the current directory.
But why is the current directory identified as linked to itself?
The ls -la command for the test directory gives the following output (the -F option appends a / to directory names and a * to executables):
ls -la -F
total 0
drwxr-xr-x 6 azbc staff 192 Sep 7 16:09 ./
drwxr-xr-x+ ?? azbc staff ?? Sep 7 16:06 ../
-rw-r--r-- 1 azbc staff 0 Sep 7 16:09 .hidden_file.txt
drwxr-xr-x 2 azbc staff 64 Sep 7 16:06 dir_1/
drwxr-xr-x 2 azbc staff 64 Sep 7 16:06 dir_2/
-rwx--x--x 1 azbc staff 0 Sep 7 16:06 file_1.sh*
A file itself is identified with only one link. Is the file linked to itself? Or is it linked to the directory it is in?
In the listing of a directory, the directory itself appears as an entry, so it seems logical to count it as a link.
However, in the listing of a file, only the file itself appears.
ls -la -F file_1.sh
-rwx--x--x 1 azbc staff 0 Sep 7 16:06 file_1.sh
That makes it logical to say that the file is linked to itself.
However, it seems more logical to me to say that the file is linked to the directory it is in.
This seems inconsistent.
Or is the link count merely a count of the files and directories present in the listing output of the command, and not an identification of the real links to the file or directory in the file system?
Edit: as reply to @George Udosen, on:
"Now to try and answer your query in the comment:
'What is here being listed as a link ? Is the file itself listed ? Or is the directory that contains the file being listed ?'"
If I list the directory test:
ls -la -F test
...
drwxr-xr-x 2 azbc staff 64 Sep 7 16:06 dir_1/
...
-rwx--x--x 1 azbc staff 0 Sep 7 16:06 file_1.sh*
it identifies the directory dir_1 with 2 links!
If I then list that directory test/dir_1:
ls -la -F test/dir_1
total 0
drwxr-xr-x 2 azbc staff 64 Sep 7 16:06 ./
drwxr-xr-x 9 azbc staff 288 Sep 7 21:37 ../
Hey, indeed! It lists 2 entries!
The file file_1.sh* was identified with 1 link.
If I list the file file_1.sh:
ls -la -F test/file_1.sh
-rwx--x--x 1 azbc staff 0 Sep 7 16:06 test/file_1.sh*
Ho! It indeed lists 1 entry, namely file_1.sh itself, and again identifies that file with 1 entry.
By the way, can I conclude from this that every entry listed as having 1 link is a file and not a directory? Ho, this seems not to be the case, as symbolic links are also listed as having 1 link / 1 entry.
I recommend you read "What are directories, if everything on Linux is a file?" for more in-depth knowledge of directory structure, history, and the terminology of how directories work and their elements (inode, dirent structure, etc.), although it's not required for this question.
What are dot '.' and dot-dot '..' directories ?
Looking at the "format of directories" manual page from the 1971 edition of the UNIX Programmer's Manual, we see that '.' and '..' were already there:
By convention, the first two entries in each directory are for "." and "..". The first is an entry for the directory itself.
As for their significance, an explanation can be found in Panos's answer. Ken Thompson explained how '..' came about in a 1989 interview:
Every time we made a directory, by convention we put it in another directory called directory - directory, which was dd. Its name was dd and that all the users directories and in fact most other directories, users maintain their own directory systems, had pointers back to dd, and dd got shortened into "dot-dot," and dd was for directory-directory. It was the place back to where you could to get to all the other directories in the system to maintain this spaghetti bowl
Naturally, as you can guess, '.' stands for d, short for directory. The '.' entry shares the same inode number as the directory's actual name. Now, this still doesn't explain why the directory '.' is linked to itself, but I have a couple of ideas.
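The claim that '.' and the directory's name refer to the same inode can be checked with a small sketch like the one below. The scratch directory and the name demo are made up for illustration, and the inode numbers will differ on every system.

```shell
# Scratch directory and the name "demo" are hypothetical, for illustration only.
workdir=$(mktemp -d)
mkdir "$workdir/demo"

# ls -d lists the directory itself rather than its contents;
# -i prefixes the entry with its inode number.
inode_by_name=$(ls -di "$workdir/demo" | awk '{print $1}')
inode_by_dot=$(ls -di "$workdir/demo/." | awk '{print $1}')

echo "inode via name: $inode_by_name"
echo "inode via dot:  $inode_by_dot"

rm -rf "$workdir"
```

Both lines should print the same number, since the name in the parent and the '.' entry are two names for one inode.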
0. Unix Philosophy:
In the 1996 book "UNIX Internals: The New Frontiers" by Uresh Vahalia, in Chapter 8, page 222 it is stated:
Unix supports the notion of a current working directory for each process, maintained as part of the process state. This allows users to refer to files by their relative pathnames, which are interpreted relative to the current directory.
Considering that a directory is just a special file, we need a consistent relative filename to refer to the directory itself, and that is the special filename '.', which evolved from d, which was short for directory.
1. Technical advantages
The main advantage I can think of is that the system can simplify inode lookup, and thus access to metadata. Since the directory already has a '.' entry with the same inode, there's no need to query via the full path. The same goes for programming. Consider a very simple implementation of ls. There I use the getcwd() function to obtain the current working directory path and then pass it to opendir(). Or I could throw away getcwd() and just use opendir(".") directly. In the days of the old PDP-11, when memory size was a few kilobytes, saving on syscall overhead would have been crucial.
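As a rough shell analogue of the getcwd()/opendir() comparison (a sketch only, not the C implementation mentioned above), listing '.' and listing the absolute path of the working directory give the same result, but the first needs no path lookup at all:

```shell
# List the current directory via the "." entry ...
listing_via_dot=$(ls -a .)

# ... and via the absolute path obtained from pwd.
listing_via_path=$(ls -a "$(pwd)")

if [ "$listing_via_dot" = "$listing_via_path" ]; then
    echo "both listings are identical"
fi
```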
2. User convenience:
Consider the following example:
mv ../filename.txt .
In a presentation by Hendrik Jan Thomassen it is mentioned that the original Unix commands were short because old terminal keys were hard to press, so typing commands all day long was a real physical effort. If you are deep in a directory tree, retyping the full path of the current working directory would be tedious. Of course, mv could have been implemented so that mv <file> implies the current working directory as the destination. I can only guess why mv <original> <new> prevailed, perhaps due to the influence of other programming languages of the day.
3. Improving over MULTICS:
Note: I've never worked on MULTICS myself, so this is based on reading online sources only.
According to the 1986 MULTICS manual on pathnames:
A relative pathname may begin with one or more less-than characters ("<").
The > character is used on MULTICS as the path separator (like / on Linux). Arguably this may look confusing. Thus, ./ when referencing a command is arguably clearer: we're referencing a filename that is located in the current working directory.
This may be beneficial for other commands. It's well known how to create a file on Unix/Linux: touch ./file. On MULTICS, according to swenson.org, this is done via the an, or add_name, command:
cd foo
r 18:03 0.041 1
an foo bar
r 18:03 0.077 3
ls foo
Directories = 1.
sma foo
bar
r 18:03 0.065 0
On a side note, there's an obvious similarity when it comes to '..': navigating up one directory is done via cwd <<.
4. Referencing executables
If you run scripts on a daily basis, you know the ./script.sh syntax well. The reason for it is simple: the shell looks for executable files in the directories listed in the PATH variable, so when you provide ./ it doesn't have to look anywhere. The magic of the PATH variable is what lets you use echo instead of /bin/echo or other very lengthy paths. Now let's say that script.sh is not in your PATH, but is there in your current working directory. What do you do now? Type /very/long/path/to/the/executable/this/typing/gets/exhausting/on/PDP-11/finally/script.sh? That would throw away the whole concept of Unix simplicity! So, going back to the Unix philosophy, it also aligns with the principle of elegant design and simplicity.
Of course, some folks want to add . to PATH, but this is actually a very bad practice, so don't do that.
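Here is a small sketch of that behaviour. The scratch directory and the script name demo_script.sh are made up for illustration, and it assumes the temporary directory allows executing files and that PATH does not already contain '.'.

```shell
# Scratch directory and script name are hypothetical, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '#!/bin/sh\necho "hello from script"\n' > demo_script.sh
chmod +x demo_script.sh

# A bare name is searched for in PATH; the scratch directory is not listed there.
if command -v demo_script.sh >/dev/null 2>&1; then
    found_in_path=yes
else
    found_in_path=no
fi

# With ./ the shell skips the PATH search and runs the file from the current directory.
output=$(./demo_script.sh)

echo "found through PATH: $found_in_path"
echo "output of ./demo_script.sh: $output"
```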
Side note: the special case where '..' and '.' point to the same inode is the / directory (inode 2 on ext filesystems), which makes sense since it is the highest point in the directory tree. Of course, '..' being NULL could also work, but it's more elegant to make it point to / itself.
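That special case is easy to check for yourself. In the sketch below the actual inode number depends on the filesystem, so only the equality matters:

```shell
# Inode of the root directory, and of the entry reached through its ".."
root_inode=$(ls -di / | awk '{print $1}')
root_parent_inode=$(ls -di /.. | awk '{print $1}')

echo "/   -> inode $root_inode"
echo "/.. -> inode $root_parent_inode"
```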
Note on Link Count and Directory Hardlinks
As Gilles properly pointed out (and as referenced by George Udosen), the link count for a directory starts at 2 (one for the directory's name in its parent directory and one for its own '.' entry), with every additional link coming from the '..' entry of a subdirectory:
# new directory has link count of 2
$ stat --format=%h .
2
# Adding subdirectories increases link count
$ mkdir subdir1
$ stat --format=%h .
3
$ mkdir subdir2
$ stat --format=%h .
4
# Adding files doesn't make difference
$ cp /etc/passwd passwd.copy
$ stat --format=%h .
4
# Count of links for root
$ stat --format=%h /
25
# Count of directories directly under /, including / itself
$ find / -maxdepth 1 -type d | wc -l
24
Intuitively, it makes sense that only subdirectories count as links of a directory, since a hard link is of the same type as the original file. Except these aren't exactly hard links: a hard link creates a filename that points to the same data. By that definition, a hard link to a directory would contain the same data, i.e. the same listing of files. This could lead to loops in the filesystem, or to lots of orphaned files if all hard links to a directory were removed. For that reason, creating hard links to directories is not allowed, and to use Gilles's phrasing from another question (which I recommend you read), "...[i]n fact, many filesystems do have hard links on directories, but only in a very disciplined way..."; those are the special cases of the '.' and '..' directories.
Now the question becomes: what is actually meant by "links" in the context of directories? TL;DR: the directory structure is a tree, and the link count here reflects the number of child directories of each tree node (each leaf, i.e. a directory without subdirectories, has only 2 links). In particular, ext3 and ext4 use HTree, and XFS uses a B+ tree.
Conclusion
In the end, the reason why '.' is linked to itself is simply that it's good design. The original authors of Unix may have been working under the technological constraints of their time, but they were some of the most brilliant minds of the day (or, as they're often called, "Wizards"), and they did things for a reason.
Your question is a bit fuzzy to me, but I'll try to explain how things work, which might help you understand.
Every file stored in the system has a number (its inode number). Let's check it out:
$ ls -i -1 -a test/
9186865 .
9175041 ..
I used -1 to show the list of files in a single column, -i to show the inodes, and -a to show hidden files.
Each inode keeps information about the file: things like permissions, owner, size, number of links, modification time, and pointers to the actual file data (but not the name of the file).
Each directory is nothing more than a special file, containing a list of names (files) and the inodes corresponding to those names.
So when I remove a file (also known as unlinking a file), I'm removing its link from its parent directory, not erasing the data right away; the data is released only when no links to the inode remain.
When you create a new directory, it has 2 hard links by default: its name in the parent directory and its own '.' entry. Every directory also contains '..' in its list.
As you might know, '.' is a hard link to the current directory and '..' is a hard link to the parent directory, so if I create a new directory:
$ mkdir test
$ ls -i -d -l test
9186865 drwxrwxr-x 2 ravexina ravexina 4096 Sep 7 19:37 test
As you can see, the number of links is two. No matter how many files I create in this directory, the number of links stays the same unless I start creating directories. For each new subdirectory the number is incremented by 1, and now you know why: each new directory contains a hard link to its parent, '..'.
Remember what I said about directories?
Each directory is nothing more than a special file, containing a list of names (files) and the inodes corresponding to those names.
The links are actually these names. Each file has 1 link by default (the name it was given when it was created). If you create a new hard link to this file (that is, another name, in another directory or in the same directory, which points to the same data [inode]), the number is incremented by 1.
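To see this for a regular file, here is a minimal sketch. The scratch directory and file names are made up for illustration, and it assumes the temporary directory lives on a filesystem that supports hard links.

```shell
# Scratch directory and file names are hypothetical, for illustration only.
workdir=$(mktemp -d)
echo "some data" > "$workdir/original.txt"

# The second column of ls -l is the link count.
links_before=$(ls -l "$workdir/original.txt" | awk '{print $2}')

# Give the same inode a second name.
ln "$workdir/original.txt" "$workdir/second_name.txt"
links_after=$(ls -l "$workdir/original.txt" | awk '{print $2}')

# Removing one name only lowers the count; the data is still reachable.
rm "$workdir/original.txt"
links_final=$(ls -l "$workdir/second_name.txt" | awk '{print $2}')
remaining_data=$(cat "$workdir/second_name.txt")

echo "before ln: $links_before, after ln: $links_after, after rm: $links_final"
echo "data via the remaining name: $remaining_data"

rm -rf "$workdir"
```

The count goes from 1 to 2 when the second name is added and back to 1 when the first name is removed, while the content stays readable through the remaining name.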